Use Image streaming to pull container images

This page shows you how to use Image streaming in Google Kubernetes Engine (GKE) to pull container images by streaming the image data as your applications need it.

Autopilot clusters automatically use Image streaming to pull eligible images. The instructions on this page about enabling and disabling Image streaming apply to Standard clusters.

Overview

Image streaming is a method of pulling container images in which GKE streams data from eligible images as requested by your applications. You can use Image streaming to allow your workloads to initialize without waiting for the entire image to download, which leads to significant improvements in initialization times. The shortened pull time provides you with benefits including the following:

  • Faster autoscaling
  • Reduced latency when pulling large images
  • Faster Pod startup

With Image streaming, GKE uses a remote filesystem as the root filesystem for any containers that use eligible container images. GKE streams image data from the remote filesystem as needed by your workloads. Without Image streaming, GKE downloads the entire container image onto each node and uses it as the root filesystem for your workloads.

While streaming the image data, GKE downloads the entire container image onto the local disk in the background and caches it. GKE then serves future data read requests from the cached image.

When you deploy workloads that need to read specific files in the container image, the Image streaming backend serves only those requested files.

Requirements

You must meet the following requirements to use Image streaming in GKE Autopilot and Standard clusters:

  • You must enable the Container File System API.

  • You must use the Container-Optimized OS with containerd node image. Autopilot nodes always use this node image.

  • Your container images must be stored in standard or remote repositories in Artifact Registry.

  • If you enable private nodes on your cluster, you must enable Private Google Access on the subnet for your nodes to access the Image streaming Service.

  • If VPC Service Controls protects your container images and you use Image streaming, you must also include the Image streaming API (containerfilesystem.googleapis.com) in the service perimeter.

  • If the GKE nodes in the cluster don't use the default service account, you must ensure that your custom service account has the Service Usage Consumer (roles/serviceusage.serviceUsageConsumer) IAM role in the project that hosts the container image.

Limitations

  • Container images that use the V2 Image Manifest, schema version 1 are not eligible.
  • Container images with duplicate layers are not supported. GKE downloads these images without streaming the data. Check your container image for empty layers or duplicate layers.
  • If your workloads read many files in an image during initialization, you might notice increased initialization times because of the latency added by the remote file reads.
  • If your workloads require a large proportion of the image to be available before code can execute, you might see a delay between the time that the kubelet starts the container and the time that the container actually starts to send logs.
  • You might not notice the benefits of Image streaming during the first pull of an eligible image. However, after Image streaming caches the image, future image pulls on any cluster benefit from Image streaming.
  • GKE Standard clusters use the cluster-level configuration to determine whether to enable Image streaming on new node pools created using node auto-provisioning. However, you cannot use workload separation to create node pools with Image streaming enabled when Image streaming is disabled at the cluster level.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

Enable Image streaming on clusters

You can enable Image streaming on new or existing Standard clusters by using the gcloud CLI --enable-image-streaming flag, or using the Google Cloud console. When you create a cluster with the --enable-image-streaming flag, Image streaming will be enabled in the default node pool. New node pools you create will also have Image streaming enabled unless you disable it when creating the node pools.

All Autopilot clusters use Image streaming to pull eligible images. For instructions, refer to Set the version and release channel of a new Autopilot cluster. The following instructions only apply to GKE Standard clusters.

You can enable Image streaming on existing clusters that meet the requirements using either the gcloud CLI or the Google Cloud console.

gcloud

To update an existing cluster to use Image streaming, run the following command using the gcloud CLI:

gcloud container clusters update CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --enable-image-streaming

Replace the following:

  • CLUSTER_NAME: the name of your cluster.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

Console

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. Click the name of the cluster you want to modify.

  3. On the Clusters page, in the Features section, click the edit icon next to Image streaming.

  4. In the Edit Image streaming dialog, select the Enable Image streaming checkbox.

  5. Click Save changes.

After you modify the cluster, GKE enables Image streaming on your existing node pools automatically by default. If you explicitly enabled or disabled Image streaming on individual node pools, those node pools don't inherit the changes to the cluster-level setting.

Changing the Image streaming setting respects maintenance availability when updated at the cluster level, but not at the node pool level.

This change requires recreating the nodes, which can cause disruption to your running workloads. For details about this specific change, find the corresponding row in the manual changes that recreate the nodes using a node upgrade strategy and respecting maintenance policies table. To learn more about node updates, see Planning for node update disruptions.

Verify Image streaming is enabled on a cluster

You can check whether Image streaming is enabled at the cluster level using either the gcloud CLI or the Google Cloud console.

gcloud

Run the following command:

gcloud container clusters describe CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --flatten "nodePoolDefaults.nodeConfigDefaults"

Replace the following:

  • CLUSTER_NAME: the name of your cluster.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

The setting is enabled if the output is similar to the following:

gcfsConfig:
  enabled: true
...

The setting is disabled if the output is similar to the following:

gcfsConfig: {}
...
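
If you script this check, one option is to request JSON output (for example, with the gcloud CLI `--format=json` flag) and read the same field. The following Python sketch assumes that the JSON nests the setting under nodePoolDefaults.nodeConfigDefaults.gcfsConfig, mirroring the --flatten path above. The function name and the sample payloads are hypothetical.

```python
import json


def image_streaming_enabled(cluster: dict) -> bool:
    """Return True when the cluster-level gcfsConfig reports enabled: true."""
    defaults = (cluster.get("nodePoolDefaults") or {}).get("nodeConfigDefaults") or {}
    # An empty or missing gcfsConfig ({}) is treated as disabled.
    gcfs = defaults.get("gcfsConfig") or {}
    return gcfs.get("enabled") is True


# Hypothetical describe payloads, trimmed to the relevant fields.
ENABLED = json.loads(
    '{"nodePoolDefaults": {"nodeConfigDefaults": {"gcfsConfig": {"enabled": true}}}}'
)
DISABLED = json.loads(
    '{"nodePoolDefaults": {"nodeConfigDefaults": {"gcfsConfig": {}}}}'
)

print(image_streaming_enabled(ENABLED))   # True
print(image_streaming_enabled(DISABLED))  # False
```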

Console

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. Click the name of the cluster you want to check.

  3. On the Clusters page, in the Features section, check the value next to Image streaming to see whether the setting is enabled.

Enable Image streaming on node pools

By default, node pools inherit the Image streaming setting at the cluster level. You can enable or disable Image streaming on specific node pools using the gcloud CLI.

On a new node pool

To create a new node pool with Image streaming enabled, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --image-type="COS_CONTAINERD" \
    --enable-image-streaming

Replace the following:

  • NODE_POOL_NAME: the name of your new node pool.
  • CLUSTER_NAME: the name of the cluster for the node pool.
  • CONTROL_PLANE_LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.

On an existing node pool

You can enable Image streaming on existing node pools that meet the requirements.

To update an existing node pool to use Image streaming, run the following command:

gcloud container node-pools update POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --enable-image-streaming

Replace the following:

  • POOL_NAME: the name of your node pool.
  • CLUSTER_NAME: the name of the cluster for the node pool.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

Changing the Image streaming setting respects maintenance availability when updated at the cluster level, but not at the node pool level.

This change requires recreating the nodes, which can cause disruption to your running workloads. For details about this specific change, find the corresponding row in the manual changes that recreate the nodes using a node upgrade strategy without respecting maintenance policies table. To learn more about node updates, see Planning for node update disruptions.

Verify Image streaming is enabled on a node pool

Check whether Image streaming is enabled for a node pool:

gcloud container node-pools describe POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION

Replace the following:

  • POOL_NAME: the name of your node pool.
  • CLUSTER_NAME: the name of the cluster for the node pool.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

The setting is enabled if the output is similar to the following:

gcfsConfig:
  enabled: true
...

The setting is disabled if the output is similar to the following:

gcfsConfig: {}
...

Schedule a workload using Image streaming

After you enable Image streaming on your cluster, GKE automatically uses Image streaming when pulling eligible container images from Artifact Registry without requiring further configuration.

GKE adds the cloud.google.com/gke-image-streaming: "true" label to nodes in node pools with Image streaming enabled. On GKE Standard, if you enable or disable Image streaming on specific node pools so that your cluster has a mix of nodes that use Image streaming and nodes that don't, you can use node selectors in your deployments to control whether GKE schedules your workloads on nodes that use Image streaming.
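
As a minimal sketch of the node selector approach, the following Python snippet adds a nodeSelector for the label described above to a Deployment manifest held as a dictionary. The helper name and the sample manifest are hypothetical. Only the label key and value come from this page.

```python
import copy
import json

# Node label that GKE sets on nodes with Image streaming enabled.
STREAMING_LABEL = "cloud.google.com/gke-image-streaming"


def require_streaming_nodes(deployment: dict) -> dict:
    """Return a copy of a Deployment manifest whose Pods select streaming nodes."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec.setdefault("nodeSelector", {})[STREAMING_LABEL] = "true"
    return patched


# Hypothetical minimal Deployment; only the Pod template matters here.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example"},
    "spec": {
        "selector": {"matchLabels": {"app": "example"}},
        "template": {
            "metadata": {"labels": {"app": "example"}},
            "spec": {"containers": [{"name": "app", "image": "IMAGE_NAME"}]},
        },
    },
}

patched = require_streaming_nodes(deployment)
print(json.dumps(patched["spec"]["template"]["spec"]["nodeSelector"]))
```

Because kubectl accepts JSON manifests as well as YAML, you could write `json.dumps(patched)` to a file and apply it with `kubectl apply -f`.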

In the following example, you schedule a Deployment that uses a large container image on a cluster with Image streaming enabled. You can then optionally compare the performance to an image pull without Image streaming enabled.

  1. Create a new cluster with Image streaming enabled:

    gcloud container clusters create CLUSTER_NAME \
        --location=CONTROL_PLANE_LOCATION \
        --enable-image-streaming \
        --image-type="COS_CONTAINERD"
    
  2. Get credentials for the cluster:

    gcloud container clusters get-credentials CLUSTER_NAME \
        --location=CONTROL_PLANE_LOCATION
    
  3. Save the following manifest as frontend-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: guestbook
          tier: frontend
      template:
        metadata:
          labels:
            app: guestbook
            tier: frontend
        spec:
          containers:
          - name: php-redis
            image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5
            env:
            - name: GET_HOSTS_FROM
              value: "dns"
            resources:
              requests:
                cpu: 100m
                memory: 100Mi
            ports:
            - containerPort: 80
    

    The gb-frontend container image is 327 MB in size.

  4. Apply the manifest to your cluster:

    kubectl apply -f frontend-deployment.yaml
    
  5. Verify that GKE created the Deployment:

    kubectl get pods -l app=guestbook
    

    The output is similar to the following:

    NAME                          READY    STATUS     RESTARTS    AGE
    frontend-64bcc69c4b-pgzgm     1/1      Running    0           3s
    
  6. Get the Kubernetes event log to see image pull events:

    kubectl get events --all-namespaces
    

    The output is similar to the following:

    NAMESPACE  LAST SEEN  TYPE    REASON          OBJECT                                                 MESSAGE
    default    11m        Normal  Pulling         pod/frontend-64bcc69c4b-pgzgm                          Pulling image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5"
    default    11m        Normal  Pulled          pod/frontend-64bcc69c4b-pgzgm                          Successfully pulled image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5" in 1.536908032s
    default    11m        Normal  ImageStreaming  node/gke-riptide-cluster-default-pool-f1552ec4-0pjv    Image us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5 is backed by image streaming.
    ...
    

    In this output:

    • The Pulled event shows the time taken for Image streaming to pull the image.
    • The ImageStreaming event shows that the node uses Image streaming to serve the container image.
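
To check for the ImageStreaming event programmatically, you could export the events as JSON (for example, with `kubectl get events --all-namespaces -o json`) and filter on the reason field. The following Python sketch assumes the message format shown in the sample output above. The function name and the sample payload are hypothetical.

```python
import re

# Message format taken from the ImageStreaming event in the sample output.
STREAMED_RE = re.compile(r"Image (\S+) is backed by image streaming")


def streamed_images(event_list: dict) -> set:
    """Collect image names from events whose reason is ImageStreaming."""
    images = set()
    for event in event_list.get("items", []):
        if event.get("reason") != "ImageStreaming":
            continue
        match = STREAMED_RE.search(event.get("message", ""))
        if match:
            images.add(match.group(1))
    return images


IMAGE = "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5"
# Hypothetical payload shaped like a Kubernetes event list.
sample = {
    "items": [
        {"reason": "Pulling", "message": f'Pulling image "{IMAGE}"'},
        {"reason": "ImageStreaming", "message": f"Image {IMAGE} is backed by image streaming."},
    ]
}

print(streamed_images(sample))
```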

Compare performance with standard image pulls

In this optional example, you create a new cluster with Image streaming disabled and deploy the frontend Deployment to compare performance with Image streaming.

  1. Create a new cluster with Image streaming disabled:

    gcloud container clusters create CLUSTER2_NAME \
        --location=CONTROL_PLANE_LOCATION \
        --image-type="COS_CONTAINERD"
    
  2. Get credentials for the cluster:

    gcloud container clusters get-credentials CLUSTER2_NAME \
        --location=CONTROL_PLANE_LOCATION
    
  3. Deploy the frontend Deployment from the previous example:

    kubectl apply -f frontend-deployment.yaml
    
  4. Get the Kubernetes event log:

    kubectl get events --all-namespaces
    

    The output is similar to the following:

    NAMESPACE  LAST SEEN  TYPE    REASON     OBJECT                             MESSAGE
    default    87s        Normal  Pulled     pod/frontend-64bcc69c4b-qwmfp      Successfully pulled image "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5" in 23.929723476s
    

    Notice the time GKE took to pull the entire image. In this example output, GKE needed almost 24 seconds. With Image streaming enabled, GKE only needed 1.5 seconds to pull the image data that the workload required to start.
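
To make the comparison concrete, the following Python sketch extracts the pull durations from the two Pulled event messages shown in these examples and computes the ratio. It assumes that the duration is reported in plain seconds, as in the sample output. Messages that use other duration formats are not matched. The function name is hypothetical.

```python
import re
from typing import Optional

# Matches durations reported in plain seconds, such as "in 1.536908032s".
PULLED_RE = re.compile(r'Successfully pulled image "[^"]+" in (\d+(?:\.\d+)?)s')


def pull_seconds(message: str) -> Optional[float]:
    """Extract the pull duration in seconds from a Pulled event message."""
    match = PULLED_RE.search(message)
    return float(match.group(1)) if match else None


IMAGE = "us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5"
streamed = pull_seconds(f'Successfully pulled image "{IMAGE}" in 1.536908032s')
standard = pull_seconds(f'Successfully pulled image "{IMAGE}" in 23.929723476s')

speedup = standard / streamed
# Prints: streamed=1.5s standard=23.9s speedup=15.6x
print(f"streamed={streamed:.1f}s standard={standard:.1f}s speedup={speedup:.1f}x")
```

The ratio describes only these two sample outputs. Actual results depend on the image, the workload, and whether the image was already cached.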

Clean up

To avoid charges, delete the clusters you created in the previous examples:

gcloud container clusters delete CLUSTER_NAME CLUSTER2_NAME \
    --location=CONTROL_PLANE_LOCATION

Replace the following:

  • CLUSTER_NAME: the name of your first cluster.
  • CLUSTER2_NAME: the name of your second cluster.
  • CONTROL_PLANE_LOCATION: the location of the control plane of the clusters.

Disable Image streaming

If you use GKE Autopilot, you can't disable Image streaming on individual clusters. You can disable the Container File System API, which disables Image streaming for the entire project.

If you use GKE Standard clusters, you can disable Image streaming on individual clusters or specific node pools, as described in the following sections.

Disable Image streaming on a GKE Standard cluster

You can disable Image streaming on existing GKE Standard clusters using the gcloud CLI or the Google Cloud console.

gcloud

To disable Image streaming on an existing cluster, run the following command:

gcloud container clusters update CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --no-enable-image-streaming

Replace the following:

  • CLUSTER_NAME: the name of your cluster.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

Console

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. Click the name of the cluster you want to modify.

  3. On the Clusters page, under Features, click the edit icon next to Image streaming.

  4. In the Edit Image streaming dialog box, clear the Enable Image streaming checkbox.

  5. Click Save changes.

Changing the Image streaming setting respects maintenance availability when updated at the cluster level, but not at the node pool level.

This change requires recreating the nodes, which can cause disruption to your running workloads. For details about this specific change, find the corresponding row in the manual changes that recreate the nodes using a node upgrade strategy and respecting maintenance policies table. To learn more about node updates, see Planning for node update disruptions.

On a new node pool

To disable Image streaming when creating a new node pool, specify the --no-enable-image-streaming flag, such as in the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --no-enable-image-streaming

On an existing node pool

To disable Image streaming on an existing node pool, run the following command:

gcloud container node-pools update NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --no-enable-image-streaming

Replace the following:

  • NODE_POOL_NAME: the name of your node pool.
  • CLUSTER_NAME: the name of the cluster for the node pool.
  • CONTROL_PLANE_LOCATION: the location of the control plane of your cluster.

Changing the Image streaming setting respects maintenance availability when updated at the cluster level, but not at the node pool level.

This change requires recreating the nodes, which can cause disruption to your running workloads. For details about this specific change, find the corresponding row in the manual changes that recreate the nodes using a node upgrade strategy without respecting maintenance policies table. To learn more about node updates, see Planning for node update disruptions.

Memory reservation for Image streaming

GKE reserves memory resources for Image streaming in addition to the memory that is reserved for node system components to run. GKE does not reserve additional CPU resources for Image streaming. In GKE Standard clusters, this reservation changes the memory resources that are available for you to request in your Pods. In GKE Autopilot, GKE manages system allocations, so there's no impact to scheduling your workloads.

For details about the memory reservations GKE makes for node components, see Standard cluster architecture.

In nodes that use Image streaming, GKE makes the following additional memory reservations:

  • No additional memory for machines with less than 1 GiB of memory
  • 1% of the first 4 GiB of memory
  • 0.8% of the next 4 GiB of memory (up to 8 GiB)
  • 0.4% of the next 8 GiB of memory (up to 16 GiB)
  • 0.24% of the next 112 GiB of memory (up to 128 GiB)
  • 0.08% of any memory above 128 GiB
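
As a worked example of these tiers, the following Python sketch estimates the additional reservation for a given node memory size. It assumes that the tiers apply cumulatively, each percentage covering only its own slice of memory, and it ignores any rounding that GKE might apply. The function name is hypothetical.

```python
# (tier width in GiB, fraction reserved), in the order listed above.
TIERS = [
    (4, 0.01),
    (4, 0.008),
    (8, 0.004),
    (112, 0.0024),
    (float("inf"), 0.0008),
]


def image_streaming_reservation_gib(total_gib: float) -> float:
    """Estimate the additional Image streaming memory reservation, in GiB."""
    if total_gib < 1:
        return 0.0  # machines with less than 1 GiB reserve nothing extra
    reserved = 0.0
    remaining = total_gib
    for width, fraction in TIERS:
        portion = min(remaining, width)  # memory that falls inside this tier
        reserved += portion * fraction
        remaining -= portion
        if remaining <= 0:
            break
    return reserved


for size in (4, 8, 16):
    mib = image_streaming_reservation_gib(size) * 1024
    print(f"{size} GiB node: about {mib:.1f} MiB reserved")
```

Under these assumptions, a 16 GiB node reserves 0.04 + 0.032 + 0.032 = 0.104 GiB, or about 106.5 MiB, for Image streaming.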

Troubleshooting

The following sections provide advice on troubleshooting Image streaming. For advice on troubleshooting standard image pulls, see Troubleshoot image pulls.

GKE doesn't use the Image streaming file system

If your GKE event log doesn't show the Image streaming events, your image is not backed by the remote file system. If GKE previously pulled the image on the node, this is expected behavior because GKE uses the local cache of the image for subsequent pulls instead of using Image streaming. You can verify this by looking for Container image IMAGE_NAME already present on machine in the Message field for the Pod Pulled event.

If you don't see the Image streaming event during the first image pull on the node, ensure that you meet the requirements for Image streaming. If you meet the requirements, you can diagnose the issue by checking the logs of the Image streaming Service (named gcfsd):

  1. Go to the Logs Explorer page in the Google Cloud console:

  2. In the Query field, specify the following query:

    logName="projects/PROJECT_ID/logs/gcfsd"
    resource.labels.cluster_name="CLUSTER_NAME"
    

    Replace the following:

    • PROJECT_ID: the ID of your Google Cloud project.
    • CLUSTER_NAME: the name of your cluster.
  3. Click Run query.

To check the gcfsd logs for every cluster in the project, run the query in Logs Explorer without the cluster filter:

  1. Go to the Logs Explorer in the Google Cloud console:

  2. In the Query field, specify the following query:

    logName="projects/PROJECT_ID/logs/gcfsd"
    

    Replace PROJECT_ID with your Google Cloud project ID.

PermissionDenied

If the gcfsd logs display an error message similar to the following, the node doesn't have the correct API scope. GKE pulls container images for workloads without using Image streaming.

level=fatal msg="Failed to create a Container File System client: rpc error:
code = PermissionDenied desc = failed to probe endpoint: rpc error: code = PermissionDenied
desc = Request had insufficient authentication scopes."

You can fix this by granting the correct scope to the node to allow it to use Image streaming. Add the devstorage.read_only scope to the cluster or node pool, similar to the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --image-type="COS_CONTAINERD" \
    --enable-image-streaming \
    --scopes="https://www.googleapis.com/auth/devstorage.read_only"

FailedPrecondition

If you notice an error message with code = FailedPrecondition, the image wasn't imported to the Image streaming remote file system.

You might notice this error if you tried to use Image streaming with an existing node pool. If a node in the node pool already has the container image on-disk, GKE uses the local image instead of using Image streaming to get the image.

To fix this, try the following:

  • Wait a few minutes and try to deploy your workload again.
  • Add new nodes or a new node pool and schedule the workload on those nodes.

InvalidArgument

If you notice an error message with code = InvalidArgument, the container image your workload uses is not eligible for Image streaming. Ensure that the image meets the requirements. If your image is not in Artifact Registry, try migrating to Artifact Registry.

backend.FileContent failed

The following error might appear when reading container files with Image streaming enabled:

level=error msg="backend.FileContent failed" error="rpc error: code = ResourceExhausted desc = Quota exceeded for quota metric 'Content requests per project per region' and limit 'Content requests per project per region per minute per region' of service 'containerfilesystem.googleapis.com' for consumer 'project_number:PROJECT_NUMBER'." layer_id="sha256:1234567890" module=gcfs_backend offset=0 path=etc/passwd size=4096

This error indicates that the project has exceeded the quota required to read files from the remote container file system service. To help resolve this issue, request a quota adjustment to increase the following quota values:

  • Content requests per project per region per minute per region
  • Content requests per project per region
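
When you triage many gcfsd log lines, it can help to map the status code in each line to the matching guidance from the preceding sections. The following Python sketch is one way to do that. The remediation strings paraphrase this page, and the function name and the abbreviated sample log line are hypothetical.

```python
import re
from typing import Optional

# Remediation summaries paraphrased from the troubleshooting sections above.
REMEDIATION = {
    "PermissionDenied": "Add the devstorage.read_only scope to the cluster or node pool.",
    "FailedPrecondition": "Wait a few minutes and redeploy, or schedule on new nodes.",
    "InvalidArgument": "Check image eligibility; consider migrating to Artifact Registry.",
    "ResourceExhausted": "Request a quota adjustment for content requests.",
}

# Accepts both "code = X" and "code=X" spellings.
CODE_RE = re.compile(r"code\s*=\s*(\w+)")


def suggest_fix(log_line: str) -> Optional[str]:
    """Return a hint for the first status code found in a gcfsd log line."""
    match = CODE_RE.search(log_line)
    if match is None:
        return None
    return REMEDIATION.get(match.group(1))


# Abbreviated form of the quota error shown above.
line = (
    'level=error msg="backend.FileContent failed" '
    'error="rpc error: code = ResourceExhausted desc = Quota exceeded"'
)
print(suggest_fix(line))
```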

GKE downloads the image without streaming the data

Container images using customer-managed encryption keys (CMEK) are only eligible for Image streaming on GKE version 1.25.3-gke.1000 or later. Container images with duplicate layers are not eligible for Image streaming. See the Limitations for more information.

Checking for empty layers or duplicate layers

To check the container image for empty layers or duplicate layers, run the following command:

docker inspect IMAGE_NAME

Replace IMAGE_NAME with the name of the container image.

In the output of the command, inspect the entries under "Layers".

If one of the entries exactly matches the following sha256 value, the container image has an empty layer and is not eligible for Image streaming.

"Layers": [
  ...
  "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4",
  ...
]

If there are duplicate entries like in the following example, the container image has duplicate layers and is not eligible for Image streaming.

"Layers": [
  "sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b",
  ...
  "sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b",
  ...
]
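
For images with many layers, you can automate both checks. The following Python sketch parses docker inspect JSON output and reports empty and duplicate layers. It assumes that the layer digests appear under the RootFS.Layers field of the first element of the output array. The function name and the sample payload are hypothetical.

```python
import json
from collections import Counter

# Digest that this section identifies as an empty layer.
EMPTY_LAYER = "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"


def check_layers(inspect_output: str) -> dict:
    """Report empty and duplicate layers found in docker inspect JSON output."""
    image = json.loads(inspect_output)[0]  # docker inspect prints a JSON array
    counts = Counter(image["RootFS"]["Layers"])
    return {
        "has_empty_layer": EMPTY_LAYER in counts,
        "duplicates": sorted(d for d, n in counts.items() if n > 1),
    }


DUP = "sha256:28699c71935fe3ffa56533db44ad93e5a30322639f7be70d5d614e06a1ae6d9b"
# Hypothetical inspect output, trimmed to the RootFS field.
sample = json.dumps([{"RootFS": {"Type": "layers", "Layers": [DUP, EMPTY_LAYER, DUP]}}])

print(check_layers(sample))
```

If either check reports a problem, the image is not eligible for Image streaming according to the criteria in this section.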

mv command and renameat2 system calls fail on symlink files

For GKE nodes running version 1.25 and later, when Image streaming is enabled, the mv command and renameat2 system call might fail on symlink files in container images with the error message "No such device or address". The issue is caused by a regression in recent Linux kernels.

These system calls are not common, so the majority of images are not affected by this problem. The issue typically happens during container initialization, when an application moves files around as it prepares to run. You can't reproduce the issue by testing the image locally, so we recommend that you use Image streaming in test environments to find the issue before you use the image in production.

The fix is available in the following GKE patch versions:

  • 1.25: 1.25.14-gke.1351000 and later
  • 1.26: 1.26.9-gke.1345000 and later
  • 1.27: 1.27.6-gke.100 and later
  • 1.28: 1.28.1-gke.1157000 and later

Alternatively, to mitigate this issue for any affected workloads, you can try replacing the code leading to the renameat2 system call. If you cannot modify the code, you must disable Image streaming on the node pool to mitigate the issue.
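
To check whether a given node version already includes the fix, you can compare it against the patch versions listed above. The following Python sketch assumes version strings of the form MAJOR.MINOR.PATCH-gke.BUILD and returns None for minor versions that this page doesn't list, rather than guessing. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Minimum fixed patch per minor version, copied from the list above.
FIXED_IN = {
    (1, 25): "1.25.14-gke.1351000",
    (1, 26): "1.26.9-gke.1345000",
    (1, 27): "1.27.6-gke.100",
    (1, 28): "1.28.1-gke.1157000",
}

VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)")


def parse_gke_version(version: str) -> Tuple[int, ...]:
    """Split a GKE version string into comparable integer parts."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_rename_fix(version: str) -> Optional[bool]:
    """Return True or False for listed minor versions, None otherwise."""
    parsed = parse_gke_version(version)
    fixed = FIXED_IN.get(parsed[:2])
    if fixed is None:
        return None  # minor version not covered by the list above
    return parsed >= parse_gke_version(fixed)


for v in ("1.27.6-gke.100", "1.27.5-gke.200", "1.24.1-gke.1"):
    print(v, has_rename_fix(v))
```

The versions are compared as integer tuples rather than as strings, so that a build number such as 100 sorts before 1351000.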

What's next