[CRI] Containerd switching to HTTPS for TLS verification for a HTTP Image Registry on Localhost #4826

Open
tejasrao97 opened this issue Dec 10, 2020 · 24 comments

@tejasrao97

I tried creating a cluster using kubeadm with containerd as the CRI and came across a new error.
The containerd version I used is 1.4.3 and the kubeadm version is 1.19.4.
I'm getting a certificate error from the containerd runtime when pulling images hosted on GitLab (running as a pod) using localhost (127.0.0.1) as the registry domain. The request goes over HTTP at first, but localhost is then resolved to the load balancer IP on which the GitLab pod is exposed, and containerd tries to validate the certificate. Validation fails because no certificates have been added to the load balancer.

Steps to reproduce the issue:
1. Create a Kubernetes cluster using kubeadm (version 1.19.4) and containerd (1.4.3).
2. Run a GitLab pod in the cluster and push images to the container registry hosted on GitLab using skopeo.
3. Create a sample deployment with the image pushed to the GitLab container registry.

Describe the results you received:
RESULTS RECEIVED:--->
image
image

Containerd fails to pull the image from GitLab because of a certificate error for the GitLab endpoint (load balancer).
GitLab internally authenticates the user with registry credentials without TLS verification, but the containerd CRI follows the authentication redirect with TLS enabled, so the image pull fails on the certificate check.

Describe the results you expected:
RESULTS EXPECTED:->>
image

Docker and CRI-O are able to pull the image in a setup similar to the one described above. It would be helpful if containerd had similar functionality.

Output of containerd --version:

Client:
  Version:  v1.4.3
  Revision: 269548fa27e0089a8b8278fc4fc781d7f65a939b
  Go version: go1.15.5

Server:
  Version:  1.4.3
  Revision: 269548fa27e0089a8b8278fc4fc781d7f65a939b
  UUID: db199dd7-5b26-4846-bafb-6d90f2a4e4e4

Any other relevant information:

@fuweid
Member

fuweid commented Dec 10, 2020

@tejasrao97 Could you try it with ctr image pull --skip-verify --plain-http -u? For the CRI plugin, you can also check the doc at https://github.com/containerd/containerd/blob/master/docs/cri/registry.md. Hope it helps.

# explicitly use v2 config format
version = 2

[plugins."io.containerd.grpc.v1.cri".registry.configs."my.custom.registry".tls]
  insecure_skip_verify = true

@tejasrao97
Author

@fuweid But when we use managed Kubernetes services, we don't have access to add fields to containerd's config.toml by SSHing into the nodes. We are not facing this issue in managed Kubernetes clusters that use CRI-O or Docker as the container runtime, and we haven't added any entry for the in-cluster insecure registry to the CRI-O/Docker config files.

@puneethpk

I am facing the same issue in one of our Kubernetes clusters provisioned with containerd as the container runtime. We need to find a fix for this ASAP. Until now we had this setup on Docker and weren't facing any such issues.

@fuweid
Member

fuweid commented Dec 10, 2020

link to containerd/cri#1328

@tejasrao97
Author

@fuweid In containerd 1.4.3, the CRI currently uses HTTP when pulling an image from localhost, but afterwards it tries to verify the certificates. So if TLS verification were skipped by default when the CRI pulls an image from localhost, I think that would fix the issue.

@fuweid
Member

fuweid commented Dec 10, 2020

@tejasrao97 We can't skip TLS verification by default, for security reasons. But I think skipping it for localhost sounds good.

@fuweid fuweid changed the title Containerd switching to HTTPS for TLS verification for a HTTP Registry [CRI] Containerd switching to HTTPS for TLS verification for a HTTP Registry Dec 10, 2020
@fuweid fuweid added area/cri Container Runtime Interface (CRI) kind/enhancement labels Dec 10, 2020
@tejasrao97
Author

tejasrao97 commented Dec 10, 2020

@fuweid yeah it would be good if we skip it only for localhost. Thank you! 🙂

@mikebrow
Member

mikebrow commented Dec 10, 2020

I wonder if it would make sense to have a list of IPs for which TLS verification is skipped, defaulting to localhost IPv4/IPv6. Hmm.
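
For comparison, here is a minimal sketch of how a per-host exception can already be written in the CRI registry config, using the `mirrors` and `configs` tables described in the registry doc linked earlier in this thread. The host `127.0.0.1:5000` is a hypothetical placeholder (the port is not stated in this issue). The exception is keyed to the named host, so it may not apply to a different host that the registry redirects to for authentication.

```toml
# explicitly use v2 config format
version = 2

# Hypothetical registry address; replace with the real host:port.
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."127.0.0.1:5000"]
  # Reach this host over plain HTTP.
  endpoint = ["http://127.0.0.1:5000"]

[plugins."io.containerd.grpc.v1.cri".registry.configs."127.0.0.1:5000".tls]
  # Only relevant if the host is reached over HTTPS:
  # skip certificate verification for this host alone.
  insecure_skip_verify = true
```

This still requires editing the node's config file, so it does not address the managed-cluster case raised above.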

@puneethpk

Any updates on this?

@tejasrao97
Author

Hi @mikebrow @fuweid,
Can you please let us know whether this issue will be fixed in the upcoming release of containerd?

@fuweid fuweid added this to the 1.5 milestone Feb 23, 2021
@fuweid
Member

fuweid commented Feb 23, 2021

> Hi @mikebrow @fuweid,
> Can you please let us know if this issue would be fixed in the upcoming release of containerd.

I am not sure I'll have time to work on this issue. I've just put it in the 1.5 milestone and added the help-issue label. :)

@dims
Member

dims commented Feb 26, 2021

cc @adisky

@adisky
Contributor

adisky commented Mar 1, 2021

/assign

@adisky
Contributor

adisky commented Mar 1, 2021

I tried setting up an insecure Docker registry and pulling images with ctr/crictl, without any additional config in /etc/containerd/config.toml. I am able to pull images.

@tejasrao97
Author

Hi @adisky, yes, we can pull images, but we have to explicitly pass the skip-TLS-verification flag to ctr/crictl to pull from an insecure registry.
Can we instead skip TLS verification for localhost by default, without passing any flags, similar to the way the HTTPS check is skipped by default when pulling images from an insecure registry on localhost?

image

@adisky
Contributor

adisky commented Mar 1, 2021

@tejasrao97 I am not passing the skip flag and am still able to pull the images. Can you check whether #5100 works for you?
Also, if you have access, can you share your containerd config?

@tejasrao97
Author

@adisky I came across this issue a few months ago when I tried to create a Kubernetes cluster using kubeadm (1.19.4) with containerd (1.4.3) as the CRI. I no longer have the containerd config because I destroyed that cluster, but I remember not making any changes to the containerd config file; it had the default values.

@fuweid
Member

fuweid commented May 13, 2021

@tejasrao97 try the master branch since #5100 has been merged. If it works for you, please close the ticket. Thanks

@tejasrao97
Author

tejasrao97 commented May 13, 2021

@fuweid @adisky @mikebrow @dims I think it's still not working.
Below is a screenshot of the Jenkins console output for a pod template that is trying to pull an image from GitLab, which runs as a pod in the same Kubernetes cluster as Jenkins:
image

Below is a screenshot of the GitLab Docker registry in which the image is present:
image

Below is a screenshot of the ctr client on the Kubernetes master node failing to pull the image from GitLab:
image

If I add the -k flag to ctr, I'm able to pull the images, as I mentioned in the previous comments:
image

I built containerd from the master branch and used the resulting binaries to bootstrap the Kubernetes cluster using kubeadm. Screenshots of the binary versions are below. Thanks

image
image

@fuweid
Member

fuweid commented May 14, 2021

@tejasrao97 Since ctr doesn't work for you, please try crictl. Thanks.

@tejasrao97
Author

tejasrao97 commented May 14, 2021

@fuweid Sure, I'll try that. But why isn't it working in Jenkins, where ctr is not used and a pod template is used instead?

@tejasrao97
Author

@adisky @fuweid

This is the containerd config of the cluster.

version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
oom_score = 0

[grpc]
address = "/run/containerd/containerd.sock"
tcp_address = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216

[ttrpc]
address = ""
uid = 0
gid = 0

[debug]
address = ""
uid = 0
gid = 0
level = ""

[metrics]
address = ""
grpc_histogram = false

[cgroup]
path = ""

[timeouts]
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"

[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
pause_threshold = 0.02
deletion_threshold = 0
mutation_threshold = 100
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
disable_tcp_service = true
stream_server_address = "127.0.0.1"
stream_server_port = "0"
stream_idle_timeout = "4h0m0s"
enable_selinux = false
selinux_category_range = 1024
sandbox_image = "k8s.gcr.io/pause:3.2"
stats_collect_period = 10
systemd_cgroup = false
enable_tls_streaming = false
max_container_log_line_size = 16384
disable_cgroup = false
disable_apparmor = false
restrict_oom_score_adj = false
max_concurrent_downloads = 3
disable_proc_mount = false
unset_seccomp_profile = ""
tolerate_missing_hugetlb_controller = true
disable_hugetlb_controller = true
ignore_image_defined_volumes = false
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
default_runtime_name = "runc"
no_pivot = false
disable_snapshot_annotations = true
discard_unpacked_layers = false
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
runtime_type = ""
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
base_runtime_spec = ""
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
runtime_type = ""
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
base_runtime_spec = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
base_runtime_spec = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
max_conf_num = 1
conf_template = ""
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
[plugins."io.containerd.grpc.v1.cri".image_decryption]
key_model = ""
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
shim = "containerd-shim"
runtime = "runc"
runtime_root = ""
no_shim = false
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.snapshotter.v1.devmapper"]
root_path = ""
pool_name = ""
base_image_size = ""
async_remove = false

I did try with crictl, but I'm getting this error message:
image

This is the pod template in Jenkins that is trying to pull the image from the GitLab Docker registry, which runs as a pod in the same cluster as Jenkins.
image

Error Log in Jenkins
image

The same setup works fine in clusters whose container runtime is either CRI-O or Docker.

@tejasrao97 tejasrao97 changed the title [CRI] Containerd switching to HTTPS for TLS verification for a HTTP Registry [CRI] Containerd switching to HTTPS for TLS verification for a HTTP Registry on Localhost May 31, 2021
@tejasrao97 tejasrao97 changed the title [CRI] Containerd switching to HTTPS for TLS verification for a HTTP Registry on Localhost [CRI] Containerd switching to HTTPS for TLS verification for a HTTP Image Registry on Localhost May 31, 2021
@kzys kzys modified the milestones: 1.5, 1.6 Dec 17, 2021
@kzys
Member

kzys commented Dec 17, 2021

Moving to 1.6 milestone since we've released 1.5.

@mikebrow
Member

mikebrow commented Feb 7, 2022

I would like to see the containerd debug log (containerd -l debug) for this scenario.

We added a lot of useful debug output and refactored this code around the time this was reported and afterwards. It is possible that your scenario is failing for a non-obvious reason, after which containerd tries your configured mirror/default and reports an uninteresting error, so you never see the actual reason for the failure, especially given the referenced image pull secret.
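
A minimal sketch of turning on the same debug logging through the config file instead of the command-line flag, assuming the config lives at /etc/containerd/config.toml as mentioned earlier in the thread. The `[debug]` table is the one that appears with an empty `level` in the config dump above.

```toml
[debug]
  # Same effect as starting containerd with `-l debug`.
  level = "debug"
```

containerd has to be restarted for the change to take effect.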

@dmcgowan dmcgowan modified the milestones: 1.6, 1.7 Feb 17, 2022
@dmcgowan dmcgowan removed this from the 1.7 milestone Mar 2, 2023
@dosubot dosubot bot added the Stale label Aug 5, 2024

8 participants