
Increase Gateway CRD Infrastructure Annotation Limit #2734

Closed
jschwartzy opened this issue Jan 23, 2024 · 20 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jschwartzy

What would you like to be added:
In the experimental channel of the Gateway CRD, the maximum number of infrastructure annotations (properties) is set to 8.
https://github.com/kubernetes-sigs/gateway-api/blob/main/config/crd/experimental/gateway.networking.k8s.io_gateways.yaml#L181

This enhancement requests that the limit be increased to accommodate common cloud use cases. A higher limit (around 20 properties) would be ideal.

Why this is needed:
Downstream resources (cloud load balancers, for example) often require many annotations to be configured appropriately.
For example, in an AWS environment, responsibility for creating the underlying Network Load Balancer (NLB) or Application Load Balancer (ALB) that fulfills the Gateway object is delegated to the AWS Load Balancer Controller. This controller uses annotations to configure load balancer properties such as health checks, security group associations, etc.

Here are some examples of these types of configurations we typically see in our clusters:

Application (L7) Load Balancer Annotations:

    alb.ingress.kubernetes.io/actions.myservice-80: '{"forwardConfig":{"targetGroups":[{"serviceName":"service","servicePort":8080,"weight":100}]},"type":"forward"}'
    alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig":
      { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/certificate-arn: ${cert_arn}
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: "15"
    alb.ingress.kubernetes.io/healthcheck-path: /
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: "5"
    alb.ingress.kubernetes.io/healthy-threshold-count: "2"
    alb.ingress.kubernetes.io/ip-address-type: ipv4
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/load-balancer-name: my-loadbalancer-name
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/security-groups: sg-0123456789abcde01
    alb.ingress.kubernetes.io/success-codes: "200"
    alb.ingress.kubernetes.io/tags: env=dev,type=alb,app=my-app
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/unhealthy-threshold-count: "2"
    kubernetes.io/ingress.class: alb

Network (L4) Load Balancer Annotations:

  service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "false"
  service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: env=dev,type=alb,app=my-app
  service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
  service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "false"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "3"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: traffic-port
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: tcp
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "10"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
  service.beta.kubernetes.io/aws-load-balancer-ip-address-type: ipv4
  service.beta.kubernetes.io/aws-load-balancer-name: my-nlb
  service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
  service.beta.kubernetes.io/aws-load-balancer-scheme: internal
  service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true
  service.beta.kubernetes.io/aws-load-balancer-target-node-labels: type=cpu
  service.beta.kubernetes.io/aws-load-balancer-type: external
  service.beta.kubernetes.io/load-balancer-source-ranges: 172.0.0.0/8, 10.0.0.0/10
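
For context, on a Gateway these annotations are supplied through the spec.infrastructure.annotations field, which is where the cap of 8 applies; whether an implementation copies them onto the generated Service or load balancer is implementation-specific. A minimal sketch (the class name, listener, and choice of annotation keys are illustrative):

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: my-gateway
    spec:
      gatewayClassName: example-class            # illustrative class name
      infrastructure:
        annotations:                             # currently limited to 8 keys
          service.beta.kubernetes.io/aws-load-balancer-scheme: internal
          service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
          service.beta.kubernetes.io/aws-load-balancer-name: my-nlb
      listeners:
      - name: http
        protocol: HTTP
        port: 80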

AWS Load Balancer Controller Annotation reference:
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.6/guide/ingress/annotations/
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.6/guide/service/annotations/

@jschwartzy jschwartzy added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 23, 2024
@howardjohn
Contributor

I don't mind this in general, but a lot of those example annotations ought to be replaced by first-class API fields that already exist

@jschwartzy
Author

I don't mind this in general, but a lot of those example annotations ought to be replaced by first-class API fields that already exist

Appreciate the feedback.

Typically, we follow the guidelines of the cloud provider and their associated documentation, which is often built on a version of Kubernetes older than the current release, so those API fields may not yet be available.

In general, I agree that reducing the number of annotations and replacing them with API fields is good practice - however, I still think we're going to need more than 8.

Would you mind providing an example of which annotation(s) are covered by Service or Ingress classes?

Thank you!

@howardjohn
Contributor

backendProtocol: service.ports.appProtocol
certificate-arn: gateway.spec.listeners[].tls
actions: HTTPRoute
ip-address-type: gateway.spec.addresses
listen-ports: gateway.spec.listeners
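
For illustration, a rough Gateway sketch using those first-class fields (names and values are made up; whether a cloud certificate ARN can be referenced from certificateRefs depends on the implementation):

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: my-gateway
    spec:
      gatewayClassName: example-class            # illustrative class name
      addresses:                                 # maps from ip-address-type per the list above
      - type: IPAddress
        value: 10.0.0.10
      listeners:                                 # instead of listen-ports
      - name: http
        protocol: HTTP
        port: 80
      - name: https
        protocol: HTTPS
        port: 443
        tls:                                     # instead of certificate-arn
          mode: Terminate
          certificateRefs:
          - kind: Secret
            name: my-cert

backend-protocol maps to appProtocol on the backend Service's ports, and the actions annotations map to HTTPRoute rules and filters.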

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 23, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 23, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) Jun 22, 2024
@dudicoco

@howardjohn can we please reopen this? This is currently a blocker for moving some implementations from Ingress to the Gateway API, because the annotations they need exceed the limit.

@howardjohn
Contributor

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Jan 22, 2025
@k8s-ci-robot
Contributor

@howardjohn: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jscaltreto

I ran into this today while trying to use gateway-api with the AWS LB controller. I was able to whittle it down to 8 annotations, but had to sacrifice some niceties like service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags and service.beta.kubernetes.io/aws-load-balancer-name which (as far as I know) cannot be configured with resource fields. Though I'm past the immediate hump, that doesn't leave any additional room for something like external-dns.

It looks like the limit of 8 originates from this comment, where it's acknowledged that it may be necessary to increase the limit in the future should a use case present itself.

AFAIK Kubernetes doesn't impose a strict limit on the number of annotations or labels a resource can have, though the total size of all annotations appears to be capped at 256 KB. That kind of limit could be a problem for gateway-api to enforce, since it would require a validating admission webhook to check the length, so limiting the number of annotations is a handy shortcut that can be managed with OpenAPI validation (that said, I suppose you could still hit the size limit with one really long annotation).
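
For reference, the cap is expressed as plain OpenAPI schema validation in the experimental Gateway CRD, roughly of this shape (abridged; exact constraints on keys and values may differ by release):

    infrastructure:
      type: object
      properties:
        annotations:
          type: object
          additionalProperties:
            type: string
          maxProperties: 8
        labels:
          type: object
          additionalProperties:
            type: string
          maxProperties: 8

So raising the cap would presumably just mean bumping maxProperties on those fields in the generated CRD.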

I'd support doubling the limits for both annotations and labels to 16 if not higher.

@dudicoco

@jscaltreto what implementation are you using?
FYI, Istio and Envoy Gateway have alternative ways of setting annotations, which provide a workaround for this issue. Other implementations might also have alternative ways of setting annotations.

@jscaltreto

Thanks for the tip, @dudicoco! I'm using Cilium. I'll take a closer look and see if there are any alternative workarounds.

Even if workarounds exist, I still think this is something that should be possible in the gateway-api spec. At the very least, the limits should be documented.

@howardjohn
Contributor

howardjohn commented Feb 17, 2025 via email

@youngnick
Contributor

We can raise the limits, but every time we do that, we make it more likely that folks will start hitting the etcd storage limits for their objects (1 MB by default). When that happens, the object won't be written to etcd or persisted to Kubernetes. This is a particular concern for annotations, since they can contain arbitrary data (as @jscaltreto says, up to 256 KB of annotation data), so it will be pretty easy to hit if you use annotations extensively.

On top of that, a large part of the point of Gateway API is to try to move people away from using annotations, sigh. I know folks asking for this are users who just want their things to work, and I understand the necessity, but I really wish we didn't need it.

I think I could agree that doubling to 16 is acceptable. @robscott, @shaneutt, @mlavacca, any thoughts?

@howardjohn
Contributor

howardjohn commented Feb 18, 2025

It's not like it's a silent failure - the experience is the same for the user whether the etcd size limit is hit or the CEL limit is hit. We just artificially make it (way) lower.

On top of that, a large part of the point of Gateway API is to try to move people away from using annotations, sigh

not all usage of labels/annotations is "break glass for a missing feature" - there's a variety of usage beyond that (especially with labels) for categorization, identification, etc.

admittedly, a lot of it is, though

@mlavacca
Member

On top of that, a large part of the point of Gateway API is to try to move people away from using annotations, sigh. I know folks asking for this are users who just want their things to work, and I understand the necessity, but I really wish we didn't need it.

Completely agree with this point. I understand the users' point as well, but I think that keeping the number of allowed annotations low helps with one of Gateway API's main objectives, i.e., moving away from "configuration through annotations". That said, I'm fine with increasing the limit to 16. It's reasonably high, yet not too high.

@shaneutt
Member

I'm seeing two comments here that mention alternative solutions. Could someone please provide a deeper summary of those alternatives here?

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) Mar 20, 2025