KEP-3610: namespace-wide global env injection #3612


Closed
wants to merge 3 commits into from

Conversation

pacoxu
Member

@pacoxu pacoxu commented Oct 12, 2022

  • One-line PR description:
    Inject namespace-wide env into every container

Closed due to #3612 (comment).

eventually something like mutating CEL admission to do this injection

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 12, 2022
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Oct 12, 2022
@k8s-ci-robot k8s-ci-robot requested a review from lavalamp October 12, 2022 06:48
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 12, 2022
@pacoxu
Member Author

pacoxu commented Oct 12, 2022

/sig auth

@pacoxu pacoxu force-pushed the 3610-env-injection branch from 81f8475 to dcdc63d Compare October 13, 2022 09:07
@pacoxu
Member Author

pacoxu commented Oct 13, 2022

/cc @deads2k @fedebongio

for some initial comments

  • whether we should add a new admission controller. (No new admission controller has been added for years.)
  • annotation or configmap or other proposals
  • namespace-wide or cluster-wide

@k8s-ci-robot k8s-ci-robot requested a review from deads2k October 13, 2022 09:19
@harshanarayana harshanarayana left a comment

I have left a few comments on this based on my prior experience writing a similar admission controller for the Kubernetes distribution that we manage in-house at work. I hope that is all right.

**annotation vs configmap**
Annotation is simpler, So I choose to use an annotation here.

- For annotation, we should divide the string by `,` and get the key and value by `=`, for instance, "key1=value1,key2=value2".

We have an admission controller added to the in-tree API server of the Kubernetes distribution that we manage in-house, and this format of `,` and `=` separators got too complicated too soon, given all the value patterns and the need to patch them easily on the go. Instead, we switched to using `env-to-inject/<env-name>: <env-value>` for the annotation, and the values were always base64-encoded so that complicated patterns could be injected.
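
To make the trade-off concrete, here is a minimal Python sketch of both annotation formats. It is only an illustration, not the code from either implementation: the function names are hypothetical, and the `env-to-inject/` prefix is taken from the comment above.

```python
import base64

# Prefix taken from the review comment; everything else here is hypothetical.
PREFIX = "env-to-inject/"


def parse_flat(value: str) -> dict:
    """Parse the single-annotation form "key1=value1,key2=value2"."""
    env = {}
    for pair in value.split(","):
        if not pair:
            continue
        key, sep, val = pair.partition("=")
        if not sep:
            # A value that itself contains "," ends up here.
            raise ValueError(f"missing '=' in {pair!r}")
        env[key] = val
    return env


def parse_per_key(annotations: dict) -> dict:
    """Parse one annotation per variable, with base64-encoded values."""
    env = {}
    for name, encoded in annotations.items():
        if name.startswith(PREFIX):
            env[name[len(PREFIX):]] = base64.b64decode(encoded).decode("utf-8")
    return env
```

A value such as `localhost,10.0.0.0/8` breaks the flat form because it contains the separator, while the per-key base64 form carries it unchanged.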

Member Author

It seems better if we choose to use annotation.
Or we can use configmap for complicated cases.

I would love to see the annotation over the config map, as the config map comes with a built-in assumption that it will have a specific name. Not that it is a bad thing, but that is just my feeling.


- A configmap named `global.env`. Then we can use the configmap key as env key, and value as env value.

**cluster wide VS namespace wide**

We did the cluster-wide case by adding an admission config file that provides a list of cluster-wide env that gets injected into all pods across namespaces, with an annotation that lets one opt out of getting env injected into the pod if required.

Member Author

In this design, there would be a new admission config yaml for this feature and apiserver should load it. Is there any admission controller that uses this pattern? I need to check.

PodPreset used to be able to do this, if I remember correctly. It had its own external config file in which you could specify ENV values, right?


Consider including folks who also work outside the SIG or subproject.
-->
As there is no validation for the key value in annotation or configmap, users will get errors util they create a new pod.

users will get errors ut

I suppose you meant users will not get errors?

- skip the env if container already has set

### Non-Goals

We might want to include that this KEP doesn't deal with env that one might want to inject via the valueFrom pattern?

Member Author

Yes. If we want to support that, we need some further design.

Having something like that would be very useful. Especially since you can enable encryption at rest for Secrets, injecting the env via valueFrom is much safer than putting the actual value in the Pod spec, isn't it?

1. If a pod has already set the env, skip it. Overwriting is too aggressive.
If users want to overwrite the env, `global.env/overwrite: true` should be set in namespace annotation.

2. Env injection only for pod creation. Not for update(The behavior that a pod update may change its env, is odd).

If the global.env values are updated at the namespace level, how would this be pushed down to the pod level? I don't see that in the PR tagged to this KEP. It might be good to have that in the non-goals.

Member Author

@pacoxu pacoxu Oct 13, 2022

For the pod level, do you mean adding env to every container and init container of the pod?
I mean that env is a container-level attribute.

Say I create the namespace with ENV x=y,a=b, and over time I change it to x=z,a=b. How will this value be sent to the pod, given that the KEP says we won't support the ENV change in the pod update case (which is perfectly valid)? Would someone have to delete all the pods in the affected namespace and let them be re-created with the new ENV? A case in point is something like a proxy with a password, where the password might change over time.

Member Author

I think this depends on whether we should overwrite existing env, or whether we need to add a flag.

A pod is usually created by a controller such as the deployment controller or the job controller. If the deployment does not set the env, new pods created from the deployment would get the new env from the new injection settings. I think overwriting is too aggressive by default. However, this may be a use case, and users may want to overwrite for some reason.

If overwriting is always the behavior, pod update can be supported. Hence, pod env can be overwritten with the latest settings once updated.

@harshanarayana harshanarayana Oct 14, 2022

True. That raises the question: if we are to do this kind of injection, wouldn't it be better done at the Deployment/STS/Job/CronJob level than at the eventual Pod level? Trying to patch a running pod in a namespace to inject or update an env will lead to an error (error: failed to patch env update to pod template:....*) anyway. One has to delete the pod and let it be recreated with the new env. But if this has to be dealt with at the pod level, then we end up bypassing all the good bits that deployments/sts and such provide through their rolling updates.

I am partial to saying that if the global env on the namespace is updated, then users will have to explicitly perform a rollout of the resources in that namespace so that their pods can inherit the new env.

I think overwriting is too aggressive by default.

This I agree on. The default should always be ignoring the overwrite. If the env exists, it should have been added as a conscious decision on the end user part.

But what I was mainly concerned about was a change to env that was injected into the Pod spec by the admission controller. That also means we would need a way to distinguish whether the admission controller added the ENV or opted not to because there was already an override at the Pod spec level. We have done something similar by adding an annotation that records whether the ENV was injected by the admission controller. If it was not, all further actions ignore such pods, honouring the default rule that the Pod spec's own value wins.

Member Author

This I agree on. The default should always be ignoring the overwrite. If the env exists, it should have been added as a conscious decision on the end user part.

I changed to using an API object instead of an annotation. I followed the LimitRange design, so the default is applied only if the env is not set. For overwriting, I added a separate forceEnv, which acts like a hard setting (the default acts like a soft one).

### Other Proposals and Concerns

**annotation vs configmap**
Annotation is simpler, So I choose to use an annotation here.
Member

This is a feature that developers might want to leverage, and not something that should have restricted use. Putting this in a namespace annotation requires they have update access on the namespace, but we consider that an elevated permission (e.g. it gives control over pod-security & network policy labels).

It seems like a first-class API would be a better fit for something like this? Using an annotation or configmap feels like a hack to get around the extra work required for an API. A full API object raises the bar, but that might not be a bad thing.

Member

An API object also addresses the validation concerns, and makes it easier to support the ValueFrom use case.

Member Author

Yes. Like LimitRange or ResourceQuota. It would be better.

I updated the KEP with API object design.

[experience reports]: https://github.com/golang/go/wiki/ExperienceReports
-->

This seems to be a common requirement.
Member

I would like to see a much more fleshed out motivation for this. Specifically, I don't think this has adequately answered the question of why this should be a built in controller, and not a webhook (or eventually CEL admission policy)

Member Author

Do you mean another proposal like below?

  • User can use CEL admission for a new CRD
  • And using a webhook to add env according to the new CRD

Member Author

We can use CEL admission policy to validate if an env value is meeting the requirement.

https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/

`object.envars.filter(e, e.name == 'MY_ENV').all(e, e.value.matches('^[a-zA-Z]*$'))`: Validate the 'value' field of a listMap entry where key field 'name' is 'MY_ENV'
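
For readers less familiar with CEL, the check quoted above can be approximated in plain Python. This is only a sketch of the intended semantics (keep entries named `MY_ENV`, then require every value to consist of letters only); the function name is hypothetical and this is not how the API server evaluates CEL.

```python
import re


def my_env_values_are_alphabetic(envars: list) -> bool:
    """Mimic filter(e, e.name == 'MY_ENV').all(e, e.value matches letters only).

    envars is a list of {"name": ..., "value": ...} dicts (assumed shape).
    """
    return all(
        re.fullmatch(r"[a-zA-Z]*", e["value"]) is not None
        for e in envars
        if e["name"] == "MY_ENV"
    )
```

As with the CEL form, a list with no `MY_ENV` entry passes, because `all` over an empty selection is true.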

Member

It's not clear to me how this is different in nature from the PodPreset feature that never progressed past alpha and was removed.

Using admission webhooks for now and eventually something like mutating CEL admission to do this injection seems better at first glance than crafting a built-in API with this narrow of a use case.

Member Author

@pacoxu pacoxu Jan 6, 2023

Using admission webhooks for now and eventually something like mutating CEL admission to do this injection seems better at first glance than crafting a built-in API with this narrow of a use case.

@liggitt thanks for your information here and at the sig-auth meeting.
I have searched #2876 and it is in the future plan part.

CEL might be used in Kubernetes for extensibility beyond CRD validation. The future plans section of this KEP explains how CEL might be used for general admission control, defaulting, and conversion. This KEP aims to prove the utility of CEL for both the immediate use case (CRD validation) and these future use cases. This KEP also aims to use CEL in a way that is congruent with these future use cases.

Is there any process for mutating CEL admission? The use case here is one of the defaulting parts IMO.

/close

Member

Is there any process for mutating CEL admission? The use case here is one of the defaulting parts IMO.

Not yet, but it's a goal... use/integration of CEL is being worked on incrementally to help us gain more experience with integrating CEL, with the following items progressing/completing ahead of mutating capabilities:

cc @jpbetz @cici37 @tallclair

Contributor

cc @andrewsykim who is exploring CEL for mutating admission

@pacoxu pacoxu force-pushed the 3610-env-injection branch from 00da57d to 7d1cacd Compare October 19, 2022 12:54
Consider including folks who also work outside the SIG or subproject.
-->
For initial design with annotations, as there is no validation for the key value in annotation or configmap, users will get errors util they create a new pod.
For API Object, we can validate the key value in the API Object.
Member Author

  • This risks overwriting system env like KUBERNETES_PORT. Should there be a disallowed list?

If we allow env such as KUBERNETES_SERVICE_HOST to be overwritten, then should we not do the same for KUBERNETES_PORT also? But +1 for having a disallow list.

Member Author

https://github.com/kubernetes/kubernetes/blob/3d67e162a03d0d724dc5a15a0617c5e8572c7b4a/cmd/kubelet/app/options/options.go#L488

AllowedUnsafeSysctls may be something that we can follow. We can mark "KUBERNETES_*" as disallowed.
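
A disallow list with a wildcard such as `KUBERNETES_*` could be checked roughly as follows. This Python sketch is an assumption about the shape of the check, not the kubelet's sysctl logic: the function and constant names are hypothetical, and only the `KUBERNETES_*` pattern comes from the comment above.

```python
from fnmatch import fnmatchcase

# Pattern taken from the comment above; the list itself is hypothetical.
DISALLOWED_PATTERNS = ["KUBERNETES_*"]


def split_disallowed(env: dict, patterns=DISALLOWED_PATTERNS):
    """Split env into (allowed entries, names rejected by a pattern)."""
    allowed, rejected = {}, []
    for name, value in env.items():
        # fnmatchcase is case-sensitive, matching how env names behave.
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            rejected.append(name)
        else:
            allowed[name] = value
    return allowed, rejected
```

Returning the rejected names separately leaves the caller free to pick either policy: fail validation when the list is non-empty, or accept the object and skip those names at injection time.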

@harshanarayana harshanarayana Jan 6, 2023

So what would be the behavior if this is set?

  1. Say the new argument that allows certain unsafe env is not set to KUBERNETES_*
  2. Someone tries to create a GlobalEnv resource with KUBERNETES_SOMETHING: somevalue as the env injection.

Do we error out and indicate that you can't even create the GlobalEnv resource until you clean up the unsafe env, or do we accept the GlobalEnv but ignore the unsafe one during injection?

kep-number: 3610
authors:
- "@pacoxu"
owning-sig: sig-api-machinery
Member

something related to injecting env into pods seems like it is owned by sig-node more than api-machinery (api-machinery could be participating since it might be involved in the mechanism)

@pacoxu pacoxu force-pushed the 3610-env-injection branch from 7d1cacd to d3c8c71 Compare January 5, 2023 03:42
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jan 5, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pacoxu
Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval by writing /assign @johnbelamaric in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liggitt liggitt removed the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Jan 5, 2023
We support both `defaultEnv` and `forceEnv` in the spec.

- `defaultEnv` will be added to the container if the env is not set in the container.
- `forceEnv` will override the env in the container.
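
The two rules above can be sketched in a few lines of Python. This is an illustration under assumptions, not the KEP's implementation: the function name and the plain-dict shape of `default_env` and `force_env` are hypothetical, and only the field names `defaultEnv` and `forceEnv` come from the text.

```python
def inject_env(container_env: list, default_env: dict, force_env: dict) -> list:
    """Return a new env list for one container.

    container_env: list of {"name": ..., "value": ...} dicts, as in a pod spec.
    default_env / force_env: name -> value mappings (hypothetical shape).
    """
    result = [dict(entry) for entry in container_env]
    index = {entry["name"]: i for i, entry in enumerate(result)}

    # forceEnv: replace an existing entry, or append when the name is absent.
    for name, value in force_env.items():
        if name in index:
            result[index[name]] = {"name": name, "value": value}
        else:
            index[name] = len(result)
            result.append({"name": name, "value": value})

    # defaultEnv: add only when the container does not already set the name.
    for name, value in default_env.items():
        if name not in index:
            index[name] = len(result)
            result.append({"name": name, "value": value})

    return result
```

In this sketch a name present in both maps takes the forced value, since forced entries are applied first and defaults never replace an existing entry.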

Should there be an option to skip this injection at the pod level? Consider use cases like a proxy, for example.

If one pod needs a proxy because it reaches outside the cluster, while other pods don't need one because they are cluster-internal, injecting the proxy into all pods can cause problems: a misconfigured entry such as no_proxy can break the cluster. So, would it be good to have a way to annotate a pod to say it is not interested in any env injection? Something like global.env/ignore: true?

Member Author

This is another use case. Should I add it to the stories below?

For ValidatingAdmissionPolicy, they use binding mode to add a new binding with matchResources:matchLabels.
We may add a matchLabels field in GlobalEnv. By default, no matchLabels means all.
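
The "no matchLabels means all" default falls out naturally from an equality-based selector. The following Python sketch assumes a hypothetical helper name and plain dicts for labels; it only mirrors the usual matchLabels semantics and is not taken from the KEP.

```python
def selects(match_labels, pod_labels: dict) -> bool:
    """True when every key/value in match_labels is present on the pod.

    An empty or missing match_labels selects every pod.
    """
    return all(
        pod_labels.get(key) == value
        for key, value in (match_labels or {}).items()
    )
```

With this shape, excluding a pod from injection means giving the GlobalEnv a selector that the pod's labels do not satisfy, rather than adding a separate opt-out annotation.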

This is another use case. Should I add it to the stories below?

Would be good to have I think. Having a way to exclude injection can be equally important as having a way to inject something.

By default, no matchLabels means all.

Sure. This will also be in line with how the rest of the things do the match. So it is definitely better than what I mentioned above via the annotation.

know that this has succeeded?
-->
- if the env is not set in container, inject env to all containers in namespace
- if it is already set in container, override some env and not override for others

If one updated the custom resource GlobalEnv on a cluster with custom values in a given namespace, will that forcefully inject the updated values to the pods on the cluster again or one has to delete the pod manually and force the injection of these env? If the latter, then should we have that under non-goal?

Member Author

Pod updates may not change fields other than spec.containers[*].image, spec.initContainers[*].image, spec.activeDeadlineSeconds, spec.tolerations

For most other plugins, the injection only happens once the pod is created. We can add this to non-goal.

Pod updates may not change fields other than spec.containers[*].image, spec.initContainers[*].image, spec.activeDeadlineSeconds, spec.tolerations

Agreed. The only way to update the envs would be to trigger a delete of the pod. Which is definitely not the right thing to do. A line under non-goal would be good enough I think. Thanks

@pacoxu pacoxu force-pushed the 3610-env-injection branch from d3c8c71 to bb12a9d Compare January 6, 2023 06:42
@k8s-ci-robot
Contributor

@pacoxu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-enhancements-test | bb12a9d | link | true | `/test pull-enhancements-test` |
| pull-enhancements-verify | bb12a9d | link | true | `/test pull-enhancements-verify` |

- deads2k
- fedebongio
- lavalamp
prr-approvers:

I think this is no longer a valid field in kep.yaml.

xref: #3569

Member Author

Thanks for your review.

But according to #3612 (comment), this feature, like PodPreset, would be hard to take to GA. I'd like to close this PR.

@k8s-ci-robot
Contributor

@pacoxu: Closed this PR.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
6 participants