bind rke kubelet docker config path to default docker config path #3002

Merged
merged 1 commit into from
Aug 5, 2022

@kinarashah kinarashah commented Aug 4, 2022

Add bind /var/lib/kubelet/config.json:/.docker/config.json

Issue:

Pod creation fails with the following error when using a private registry:
E0801 21:49:55.450436 6271 kuberuntime_manager.go:815] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed pulling image \"ec2-3-144-97-102.us-east-2.compute.amazonaws.com/rancher/mirrored-pause:3.6\": Error response from daemon: Get https://ec2-3-144-97-102.us-east-2.compute.amazonaws.com/v2/rancher/mirrored-pause/manifests/3.6: no basic auth credentials" pod="kube-system/rke-network-plugin-deploy-job-92w6t"
rancher/rancher#38473

Problem

RKE stores the Docker config at /var/lib/kubelet/config.json (https://github.com/rancher/rke-tools/blob/master/entrypoint.sh#L75), but the external cri-dockerd relies on the k8s credential provider, which reads only from the default Docker config locations (https://github.com/kubernetes/kubernetes/blob/master/pkg/credentialprovider/config.go#L136). Because no auth information is found, cri-dockerd falls back to pulling the sandbox image (pause image) without credentials, and provisioning breaks when a private registry is used (https://github.com/Mirantis/cri-dockerd/blob/master/core/sandbox_helpers.go#L422).

Solution

The solution is to mount RKE's default config.json at /.docker/config.json, the path where cri-dockerd expects it.
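
To make the mount concrete, here is a minimal sketch of how the two bind strings could be assembled. It assumes the config path given in this description; `criDockerdBinds` and `kubeletDockerConfigPath` are illustrative names, not RKE's actual helper or constant, and `prefixPath` stands in for the host prefix path used in the diff.

```go
package main

import (
	"fmt"
	"path"
)

// kubeletDockerConfigPath is where RKE writes the Docker config, per the PR
// description. The constant name is an assumption made for this sketch.
const kubeletDockerConfigPath = "/var/lib/kubelet/config.json"

// criDockerdBinds is a hypothetical helper that returns the bind mounts in
// "host:container" form. prefixPath may be empty when no host prefix is used.
func criDockerdBinds(prefixPath string) []string {
	return []string{
		// Existing bind for cri-dockerd state, with the SELinux ":z" label.
		fmt.Sprintf("%s:/var/lib/cri-dockerd:z", path.Join(prefixPath, "/var/lib/cri-dockerd")),
		// New bind: expose RKE's Docker config at a default lookup location.
		fmt.Sprintf("%s:%s", path.Join(prefixPath, kubeletDockerConfigPath), "/.docker/config.json"),
	}
}

func main() {
	for _, b := range criDockerdBinds("") {
		fmt.Println(b)
	}
}
```

With an empty prefix, the second bind is the same string as the one in the PR summary, `/var/lib/kubelet/config.json:/.docker/config.json`.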

Testing

rke stores auth info in /var/lib/kubelet/config.json, but cri-dockerd
relies on the k8s.io credential provider, which uses only the default
config provider. This bind allows cri-dockerd to pull the sandbox pause
image from a private registry.
@kinarashah kinarashah requested review from jiaqiluo, Oats87 and a team August 4, 2022 20:06
@jiaqiluo jiaqiluo left a comment

LGTM

@jiaqiluo jiaqiluo requested a review from a team August 4, 2022 21:05
Comment on lines 513 to +516
  if parsedRangeAtLeast124(parsedVersion) {
      CommandArgs["container-runtime-endpoint"] = "unix:///var/run/cri-dockerd.sock"
-     Binds = []string{fmt.Sprintf("%s:/var/lib/cri-dockerd:z", path.Join(host.PrefixPath, "/var/lib/cri-dockerd"))}
+     Binds = []string{fmt.Sprintf("%s:/var/lib/cri-dockerd:z", path.Join(host.PrefixPath, "/var/lib/cri-dockerd")),
+         fmt.Sprintf("%s:%s", path.Join(host.PrefixPath, KubeletDockerConfigPath), "/.docker/config.json")}

@jiaqiluo jiaqiluo Aug 4, 2022

Hi @kinarashah, although I approved the PR, I have one question and would like your confirmation:

The binds are added only when the cluster version is at least 1.24, so they are not applied when cri-dockerd is enabled on clusters <= 1.23, right? Why do we do it this way? Is it because the problem does not exist in the older version of cri-dockerd used on clusters <= 1.23?
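
The gating in question can be sketched roughly as follows. This is a simplified illustration, not RKE's code: `atLeast124` is a hypothetical stand-in for `parsedRangeAtLeast124` that only compares major and minor numbers, `extraKubeletBinds` is an invented name, and nesting the version check under the cri-dockerd check is an assumption based on this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atLeast124 reports whether a version such as "v1.24.2-rancher1-1" is
// Kubernetes 1.24 or newer. Hypothetical helper: it parses only the major
// and minor components and ignores patch and build suffixes.
func atLeast124(version string) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, err
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, err
	}
	return major > 1 || (major == 1 && minor >= 24), nil
}

// extraKubeletBinds returns the Docker config bind only when cri-dockerd is
// enabled AND the cluster is at least 1.24, mirroring the gating discussed.
func extraKubeletBinds(version string, criDockerdEnabled bool) ([]string, error) {
	if !criDockerdEnabled {
		return nil, nil
	}
	ok, err := atLeast124(version)
	if err != nil || !ok {
		return nil, err
	}
	return []string{"/var/lib/kubelet/config.json:/.docker/config.json"}, nil
}

func main() {
	for _, v := range []string{"v1.23.8-rancher1-1", "v1.24.2-rancher1-1"} {
		binds, err := extraKubeletBinds(v, true)
		fmt.Println(v, binds, err)
	}
}
```

Under this gating, a 1.23 cluster with cri-dockerd enabled gets no Docker config bind, which is the case the question raises.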

Member Author

Nope, you're right. This should be outside the v1.24 block but under the IsCRIDockerdEnabled check.

Member Author

@jiaqiluo On second thought, I'm going to hold off on adding the change for 1.23 or older clusters. We'd have to hardcode checks for the latest patch versions here in order to avoid cluster auto-upgrades caused by the change in the kubelet process.

We do have a workaround if users want to enable cri-dockerd on <=1.23 clusters and use a private registry, so I think we should be good there.

The workaround is to add the same bind mount in cluster.yml under services/kubelet/extra_binds.
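
For reference, the workaround might look like the following cluster.yml fragment. It is a sketch that assumes the default paths from this PR and no host prefix path; adjust the host side of the bind if your nodes use one.

```yaml
# cluster.yml (fragment): workaround for clusters <= 1.23 with cri-dockerd enabled
services:
  kubelet:
    extra_binds:
      # host path : path inside the kubelet container
      - "/var/lib/kubelet/config.json:/.docker/config.json"
```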

Member Author

So we can merge this PR if the change looks good for 1.24; if needed, I'll put up a new PR for other k8s versions.

@snasovich snasovich requested a review from a team August 5, 2022 16:11
@jiaqiluo jiaqiluo requested a review from a team August 5, 2022 17:15