InvalidImageName error could be moved to API Validation. #115736

kannon92 · 2023-02-13T17:35:51Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

Previously, a pod could hit InvalidImageName if its image name contained capital letters or otherwise failed the Docker reference regular-expression checks. This PR moves those checks to API validation at pod creation, which avoids creating a resource that can never run.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Add API Validation for container image names.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

kubernetes/enhancements#3816

kannon92 · 2023-02-13T17:36:31Z

/sig node

thockin

Can you also include the removal of the kubelet code in the same PR, so we can see it is 1-for-1 ?

pkg/apis/core/validation/validation.go

kannon92 · 2023-02-13T18:28:11Z

Can you also include the removal of the kubelet code in the same PR, so we can see it is 1-for-1 ?

I don't think we can remove it in kubelet. I'll paste the kubelet code here. I think there is a bit more in the Kubelet code than just doing validation:

	// If the image contains no tag or digest, a default tag should be applied.
	image, err := applyDefaultImageTag(container.Image)
	if err != nil {
		msg := fmt.Sprintf("Failed to apply default image tag %q: %v", container.Image, err)
		m.logIt(ref, v1.EventTypeWarning, events.FailedToInspectImage, logPrefix, msg, klog.Warning)
		return "", msg, ErrInvalidImageName
	}

// applyDefaultImageTag parses a docker image string, if it doesn't contain any tag or digest,
// a default tag will be applied.
func applyDefaultImageTag(image string) (string, error) {
	named, err := dockerref.ParseNormalizedNamed(image)
	if err != nil {
		return "", fmt.Errorf("couldn't parse image reference %q: %v", image, err)
	}
	_, isTagged := named.(dockerref.Tagged)
	_, isDigested := named.(dockerref.Digested)
	if !isTagged && !isDigested {
		// we just concatenate the image name with the default tag here instead
		// of using dockerref.WithTag(named, ...) because that would cause the
		// image to be fully qualified as docker.io/$name if it's a short name
		// (e.g. just busybox). We don't want that to happen to keep the CRI
		// agnostic wrt image names and default hostnames.
		image = image + ":latest"
	}
	return image, nil
}
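
For readers following the thread, here is a minimal sketch of the defaulting step only, written against the standard library. It is not the kubelet code: applyDefaultTagSketch is a hypothetical name, it performs no validation at all, and it only approximates how a tag or digest is detected.

```go
package main

import (
	"fmt"
	"strings"
)

// applyDefaultTagSketch is a hypothetical, simplified stand-in for the
// kubelet's applyDefaultImageTag. It does NOT parse or validate the
// reference; it only decides whether ":latest" should be appended.
func applyDefaultTagSketch(image string) string {
	// A digest reference contains '@' (for example name@sha256:...).
	if strings.Contains(image, "@") {
		return image
	}
	// A tag is a ':' after the last '/', so a registry port such as
	// localhost:5000/app is not mistaken for a tag.
	lastSlash := strings.LastIndex(image, "/")
	if strings.Contains(image[lastSlash+1:], ":") {
		return image
	}
	// Plain concatenation keeps short names short (busybox stays busybox),
	// mirroring the reasoning in the kubelet comment above.
	return image + ":latest"
}

func main() {
	for _, img := range []string{"busybox", "busybox:1.36", "localhost:5000/app", "app@sha256:abc"} {
		fmt.Printf("%s -> %s\n", img, applyDefaultTagSketch(img))
	}
}
```

The point of the sketch is that defaulting and validation are separate concerns, which is why removing the kubelet code is not a simple 1-for-1 swap.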

pkg/apis/core/validation/validation.go

pkg/apis/core/validation/validation_test.go

bart0sh · 2023-02-17T12:14:02Z

pkg/apis/core/validation/validation.go

+				allErrs = append(allErrs, field.Invalid(path.Child("image"), ctr.Image, "repository name must be all lowercase"))
+			} else if strings.Contains(err.Error(), "hexadecimal strings") {
+				allErrs = append(allErrs, field.Invalid(path.Child("image"), ctr.Image, "repository name must not specify 64-byte hexadecimal strings"))
+			}


Do we ignore other errors on purpose? If so, why?

Yes, some of them are ignored just so that all unit tests pass.

@derekwaynecarr had a comment in this code about padded whitespace being a case that we shouldn't check:

// TODO: do not validate leading and trailing whitespace to preserve backward compatibility.
// for example: https://github.com/openshift/origin/issues/14659 image = " " is special token in pod template
// others may have done similar

I found that our unit tests allow empty image names so I don't check for that and I also avoid this check if we have padded whitespace.
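
The skip rules described here can be sketched roughly as follows. shouldValidateImage is a hypothetical helper name, not something from the PR, and the sketch assumes that "padded whitespace" means any leading or trailing whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldValidateImage is a hypothetical helper that reports whether an image
// string should be passed to the stricter name check at all.
func shouldValidateImage(image string) bool {
	// Empty names are allowed by existing unit tests, so they are skipped.
	if image == "" {
		return false
	}
	// Leading or trailing whitespace is skipped for backward compatibility
	// (for example, image = " " used as a placeholder in a pod template).
	if strings.TrimSpace(image) != image {
		return false
	}
	return true
}

func main() {
	for _, img := range []string{"", " ", " busybox", "busybox"} {
		fmt.Printf("%q -> validate=%v\n", img, shouldValidateImage(img))
	}
}
```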

It is interesting that we actually check and fail this for ephemeral containers but not for normal containers.

I pushed a change to spell out more cases and report user friendly errors:

The following cases are test cases I added:

capital letters in the image name.

sha256 in the image name

http or https in the image name

and general error that fails the docker check

I'm not sure it makes sense to double check errors returned by ParseNormalizedNamed in general.
What would happen if ParseNormalizedNamed error messages are changed or new checks are added?

I'm not sure of the best approach here.

#115736 (comment) was one suggestion to better align these validation messages with the current approach.

In general I think we would want more descriptive error messages if this validation fails. The ParseNormalizedNamed will return invalid reference format for all these cases. We double check errors so we could give a more descriptive error in the validation message.

As a user, I usually find descriptive errors to be much more helpful than a general one.

I do have unit tests around this function so that we can make sure that we translate cases to error messages.

ParseNormalizedNamed will return invalid reference format for all these cases.

It will return "invalid reference format: repository name must be lowercase" or "invalid repository name (%s), cannot specify 64-byte hexadecimal strings" or "reference %s has no name", which are quite descriptive from my point of view. If you find them less descriptive than needed I'd suggest to submit a PR for github.com/docker/distribution/digestset.

If new checks are added to the ParseNormalizedNamed API we wouldn't notice it here and consider them as "image is an invalid reference type". This is not what we want, I believe.

#115736 (comment) was one suggestion to better align these validation messages with the current approach

Would it help to include the error message returned by ParseNormalizedNamed in the Kubernetes validation error message?
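
One way to picture this suggestion is to wrap the parser's error instead of matching on its text. The sketch below is illustrative only: parseImage is a hypothetical, much stricter stand-in for the Docker reference parser (it rejects any uppercase character), and the message layout is an assumption rather than the format Kubernetes validation produces.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errUppercase is a hypothetical sentinel standing in for a parser error.
var errUppercase = errors.New("invalid reference format: repository name must be lowercase")

// parseImage is a hypothetical, heavily simplified parser used only for
// illustration; the real checks live in the Docker reference library.
func parseImage(image string) error {
	if image != strings.ToLower(image) {
		return errUppercase
	}
	return nil
}

// validateImage wraps whatever the parser reports, so messages for checks
// added to the parser later are surfaced without any string matching here.
func validateImage(image string) error {
	if err := parseImage(image); err != nil {
		return fmt.Errorf("image: invalid value %q: %w", image, err)
	}
	return nil
}

func main() {
	err := validateImage("BusyBox")
	fmt.Println(err)
	// The original cause stays inspectable through the wrapped chain.
	fmt.Println(errors.Is(err, errUppercase))
	fmt.Println(validateImage("busybox"))
}
```

Wrapping trades control over the exact wording for robustness: the validation layer no longer needs updating when the parser's messages change.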

@thockin and you have very similar thoughts!

#115736 (comment)

I'm not as sure about this code. Is it appropriate to apply a default tag in the validation logic?

I could probably move this utility to validation.go and maybe remove it from kubelet image_manager but not sure what problems that would cause?

I was actually thinking about this code: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_image.go#L35
Kubeadm may also have the code that pulls images. Other k/k components can also have something similar.

I'm still pretty new to Kubernetes so I would need help on where to look. Where is the code for Kubeadm?

I found out that ParseImageName is used in most places, and we only use a special case in image_manager because we don't want the Docker URL. The author uses ParseNormalizedNamed so they can ignore the fact that ParseImageName returns the full Docker repo name.

I updated the PR.

This raises an interesting point!

Kubelet can create Pods from static files on disk, so it needs to do its own validation, but I don't honestly know if it is better to create such a pod with an error condition (which is very visible to the user) or to fail the creation, which will be buried in the kubelet logs.

Perhaps this is why InvalidImageName is implemented as it is today? I don't know the history of that, and we should be careful not to break a UX that was, maybe, intentional.

@derekwaynecarr or @dchen1107 or @smarterclayton may recall the history?

Otherwise we should do some archaeology (git blame + git log) to see if there were discussions about this.

I hope I didn't send you on a wild goose chase, but non-zero chance...

Maybe I should put this PR up as a topic of discussion in sig-node.

bart0sh · 2023-02-17T12:18:01Z

/triage accepted
/priority important-longterm
/assign

pkg/kubelet/images/image_manager.go

pkg/apis/core/validation/validation.go

pkg/kubelet/images/image_manager.go

kannon92 · 2023-02-22T15:57:16Z

/retest

bart0sh · 2023-02-23T08:34:53Z

pkg/kubelet/images/image_manager_test.go

@@ -209,7 +239,9 @@ func TestParallelPuller(t *testing.T) {
 				fakeRuntime.CalledFunctions = nil
 				fakeClock.Step(time.Second)
 				_, _, err := puller.EnsureImageExists(ctx, pod, container, nil, nil)
-				fakeRuntime.AssertCalls(expected.calls)
+				if len(expected.calls) > 0 {


@kannon92 This looks like we're ignoring calls if we're not expecting them. Why?

I really don't understand how we do mocking of functions here.

I found that all the calls use GetImageRef and I notice that is a function in the ImageService. I'm not sure if I have to refactor this and add functions to the TestRuntime in order to properly mock them.

So I found that if I don't put an expect call I can still assert against errors that aren't present in the FakeRuntime.

func (f *FakeRuntime) GetImageRef(_ context.Context, image kubecontainer.ImageSpec) (string, error) {
	f.Lock()
	defer f.Unlock()
	f.CalledFunctions = append(f.CalledFunctions, "GetImageRef")
	for _, i := range f.ImageList {
		if i.ID == image.Image {
			return i.ID, nil
		}
	}
	return "", f.InspectErr
}

it looks like a bug in the FakeRuntime init code. You can try to apply this patch to your PR branch. It should fix panics and reveal test failures:

diff --git a/pkg/kubelet/images/image_manager_test.go b/pkg/kubelet/images/image_manager_test.go
index 64f3e1d9738..faf3062d3ec 100644
--- a/pkg/kubelet/images/image_manager_test.go
+++ b/pkg/kubelet/images/image_manager_test.go
@@ -196,7 +196,7 @@ func (m *mockPodPullingTimeRecorder) RecordImageStartedPulling(podUID types.UID)
 func (m *mockPodPullingTimeRecorder) RecordImageFinishedPulling(podUID types.UID) {}
-func pullerTestEnv(c pullerTestCase, serialized bool) (puller ImageManager, fakeClock *testingclock.FakeClock, fakeRuntime *ctest.FakeRuntime, container *v1.Container) {
+func pullerTestEnv(t *testing.T, c pullerTestCase, serialized bool) (puller ImageManager, fakeClock *testingclock.FakeClock, fakeRuntime *ctest.FakeRuntime, container *v1.Container) {
 	container = &v1.Container{
 		Name: "container_name",
 		Image: c.containerImage,
@@ -207,7 +207,7 @@ func pullerTestEnv(c pullerTestCase, serialized bool) (puller ImageManager, fake
 	fakeClock = testingclock.NewFakeClock(time.Now())
 	backOff.Clock = fakeClock
-	fakeRuntime = &ctest.FakeRuntime{}
+	fakeRuntime = &ctest.FakeRuntime{T: t}
 	fakeRecorder := &record.FakeRecorder{}
 	fakeRuntime.ImageList = []Image{{ID: "present_image:latest"}}
@@ -231,17 +231,15 @@ func TestParallelPuller(t *testing.T) {
 	useSerializedEnv := false
 	for _, c := range cases {
-		puller, fakeClock, fakeRuntime, container := pullerTestEnv(c, useSerializedEnv)
+		puller, fakeClock, fakeRuntime, container := pullerTestEnv(t, c, useSerializedEnv)
 		t.Run(c.testName, func(t *testing.T) {
 			ctx := context.Background()
 			for _, expected := range c.expected {
-				fakeRuntime.CalledFunctions = nil
+				fakeRuntime.ClearCalls()
 				fakeClock.Step(time.Second)
 				_, _, err := puller.EnsureImageExists(ctx, pod, container, nil, nil)
-				if len(expected.calls) > 0 {
-					fakeRuntime.AssertCalls(expected.calls)
-				}
+				fakeRuntime.AssertCalls(expected.calls)
 				assert.Equal(t, expected.err, err)
 			}
 		})
@@ -261,7 +259,7 @@ func TestSerializedPuller(t *testing.T) {
 	useSerializedEnv := true
 	for _, c := range cases {
-		puller, fakeClock, fakeRuntime, container := pullerTestEnv(c, useSerializedEnv)
+		puller, fakeClock, fakeRuntime, container := pullerTestEnv(t, c, useSerializedEnv)
 		t.Run(c.testName, func(t *testing.T) {
 			ctx := context.Background()
@@ -322,7 +320,7 @@ func TestPullAndListImageWithPodAnnotations(t *testing.T) {
 	}}
 	useSerializedEnv := true
-	puller, fakeClock, fakeRuntime, container := pullerTestEnv(c, useSerializedEnv)
+	puller, fakeClock, fakeRuntime, container := pullerTestEnv(t, c, useSerializedEnv)
 	fakeRuntime.CalledFunctions = nil
 	fakeRuntime.ImageList = []Image{}
 	fakeClock.Step(time.Second)
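
The reason the patch makes the assertion unconditional can be shown with a small call-recording fake. This is a sketch under assumptions: fakeRuntime below is a hypothetical stand-in that only mirrors the shape of the kubelet's FakeRuntime, and its AssertCalls returns an error instead of failing a *testing.T.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// fakeRuntime is a hypothetical, minimal call-recording fake.
type fakeRuntime struct {
	mu         sync.Mutex
	calls      []string
	images     []string
	inspectErr error
}

// GetImageRef records the call, then looks the image up in the fake list.
func (f *fakeRuntime) GetImageRef(image string) (string, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.calls = append(f.calls, "GetImageRef")
	for _, id := range f.images {
		if id == image {
			return id, nil
		}
	}
	return "", f.inspectErr
}

// ClearCalls resets the recorded calls between test steps.
func (f *fakeRuntime) ClearCalls() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.calls = nil
}

// AssertCalls returns an error unless the recorded calls match exactly.
// An empty expectation means "no calls at all", not "skip the check".
func (f *fakeRuntime) AssertCalls(expected []string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if len(expected) == 0 && len(f.calls) == 0 {
		return nil
	}
	if !reflect.DeepEqual(expected, f.calls) {
		return fmt.Errorf("expected calls %v, got %v", expected, f.calls)
	}
	return nil
}

func main() {
	f := &fakeRuntime{images: []string{"present_image:latest"}}
	_, _ = f.GetImageRef("present_image:latest")
	// With an unconditional assertion, an unexpected call is reported.
	fmt.Println(f.AssertCalls(nil))
	fmt.Println(f.AssertCalls([]string{"GetImageRef"}))
}
```

Guarding the assertion with len(expected.calls) > 0 would silently accept the first case, which is the behavior the review comment questions.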

Thank you! I’ll apply that and test.

I decided to move some of this into a separate PR: #116231

This was mostly so we can discuss the feature separate from some test fixes and additions.

pkg/util/parsers/parsers_test.go

alculquicondor · 2023-02-28T15:23:02Z

pkg/apis/core/validation/validation.go

@@ -3192,6 +3194,17 @@ func validateContainerCommon(ctr *core.Container, volumes map[string]core.Volume
 		allErrs = append(allErrs, field.Required(path.Child("image"), ""))


Doesn't this function also run on the pod templates when creating Jobs, Deployments, etc?

Is it acceptable to also fail the creation of those, or just Pods themselves?

If the pod was never going to be viable, it seems better to fail early? On the other hand, if someone was using this as a template (e.g. doing custom admission control as the pod is created to change the template value into a real value) then it would break them.

Do we have any evidence or anecdata?

kannon92 · 2023-02-28T22:09:22Z

/hold I am open for people to review but I think maybe I should take this to sig-node and discuss.

pkg/kubelet/images/image_manager_test.go

kannon92 · 2023-03-02T21:20:59Z

/retest

dchen1107 · 2023-03-07T18:11:37Z

/assign @dchen1107

kannon92 · 2023-03-07T18:14:38Z

/retest

kannon92 · 2023-03-07T18:19:54Z

I decided to move a lot of the test increases and utility cleanup to #116231.

k8s-ci-robot · 2023-03-17T05:36:29Z

PR needs rebase.

bart0sh · 2023-04-18T12:31:36Z

@kannon92 If you still need this PR then please rebase, if not, please close the PR

smarterclayton · 2023-04-18T20:03:24Z

From a backwards compatibility perspective, image name was intended to be passed through the system and interpreted by the CRI / container runtime directly, without being parsed. Very early on we chose not to parse it so that evolutions in the image pull spec (in Docker or OCI) could occur safely, and it was intended to be an opaque value from the Kube perspective.

I don't know if anyone has added parsing outside of the Kubelet, but I would probably be against tightening validation on the apiserver because we have no way of knowing the full use of it (people may use invalid image names as placeholders and use webhooks to resolve them at the last minute), we generally don't tighten validation on fields for forwards compatibility, and it would limit our ability to support more diverse runtime types in the future.

I do think the Kubelet should return a very clear error Reason via pod conditions to indicate that the image value was invalid, and we should define that in CRI at a minimum (CRI should return a well known error when the image spec is invalid).

The last time we discussed this was when we added image digest to the pod status,

#7203 (comment) is one of the earlier places we discussed it, #1697 describes the core issue.

kannon92 · 2023-04-24T16:44:09Z

From a backwards compatibility perspective, image name was intended to be passed through the system and interpreted by the CRI / container runtime directly, without being parsed. Very early on we chose not to parse it so that evolutions in the image pull spec (in Docker or OCI) could occur safely, and it was intended to be an opaque value from the Kube perspective.

I don't know if anyone has added parsing outside of the Kubelet, but I would probably be against tightening validation on the apiserver because we have no way of knowing the full use of it (people may use invalid image names as placeholders and use webhooks to resolve them at the last minute), we generally don't tighten validation on fields for forwards compatibility, and it would limit our ability to support more diverse runtime types in the future.

I do think the Kubelet should return a very clear error Reason via pod conditions to indicate that the image value was invalid, and we should define that in CRI at a minimum (CRI should return a well known error when the image spec is invalid).

The last time we discussed this was when we added image digest to the pod status,

#7203 (comment) is one of the earlier places we discussed it, #1697 describes the core issue.

Thank you for your explanation. I will go ahead and close this PR, but I appreciate your feedback.

k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 13, 2023

k8s-ci-robot requested review from caesarxuchao and deads2k February 13, 2023 17:37

kannon92 mentioned this pull request Feb 13, 2023

KEP-3815 : Add Condition for Pending Pods that are stuck due to configuration errors kubernetes/enhancements#3816

Closed

thockin reviewed Feb 13, 2023

kannon92 force-pushed the validate-docker-image branch from 01f8a40 to e44fe3c Compare February 13, 2023 19:01

k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 13, 2023

kannon92 commented Feb 13, 2023

kannon92 commented Feb 13, 2023

kannon92 force-pushed the validate-docker-image branch from e44fe3c to 8f21b97 Compare February 13, 2023 19:45

k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 13, 2023

kannon92 force-pushed the validate-docker-image branch from 8f21b97 to 9a19a9f Compare February 13, 2023 19:49

kannon92 changed the title ~~WIP: InvalidImageName error could be moved to API Validation.~~ InvalidImageName error could be moved to API Validation. Feb 13, 2023

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 13, 2023

bart0sh reviewed Feb 17, 2023

kannon92 force-pushed the validate-docker-image branch from 04e27df to 7a90200 Compare February 21, 2023 14:32

bart0sh reviewed Feb 21, 2023

kannon92 force-pushed the validate-docker-image branch from 7a90200 to 3e27cea Compare February 21, 2023 18:54

bart0sh reviewed Feb 22, 2023

bart0sh reviewed Feb 22, 2023

bart0sh reviewed Feb 22, 2023

kannon92 force-pushed the validate-docker-image branch from 3e27cea to b388b8b Compare February 22, 2023 15:09

bart0sh reviewed Feb 23, 2023

bart0sh reviewed Feb 28, 2023

kannon92 force-pushed the validate-docker-image branch from b388b8b to 74033ca Compare February 28, 2023 13:50

alculquicondor reviewed Feb 28, 2023

kannon92 force-pushed the validate-docker-image branch from 74033ca to 6c68cef Compare February 28, 2023 20:16

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 28, 2023

bart0sh reviewed Mar 1, 2023

kannon92 mentioned this pull request Mar 2, 2023

Using parsers in applyDefaultImageTag and adding error test cases. #116231

Merged

validate docker name

69f9aa7

kannon92 force-pushed the validate-docker-image branch from 0afbe55 to 69f9aa7 Compare March 2, 2023 20:41

k8s-ci-robot assigned dchen1107 Mar 7, 2023

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2023

kannon92 closed this Apr 24, 2023
