Friday, 2024-04-19

<rxruby> Hi, I'm trying to install magnum cluster api in openstack-ansible. I have already created the k8s containers, then tried to install k8s, but hit an error like this:  14:04
<rxruby> root@aio1:/opt/openstack-ansible# openstack-ansible osa_ops.mcapi_vexxhost.k8s_install  14:04
<rxruby> Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml -e @/etc/openstack_deploy/user_variables_octavia.yml "  14:04
<rxruby> [WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an inventory source  14:04
<rxruby> [WARNING]: running playbook inside collection osa_ops.mcapi_vexxhost  14:04
<rxruby> ERROR! the role 'openstack.osa.lxc_container_setup' was not found in osa_ops.mcapi_vexxhost:ansible.legacy:/etc/ansible/ansible_collections/osa_ops/mcapi_vexxhost/playbooks/roles:/etc/ansible/roles:/etc/ansible/roles/ceph-ansible/roles:/etc/ansible/ansible_collections/osa_ops/mcapi_vexxhost/playbooks  14:04
<rxruby> The error appears to be in '/etc/ansible/ansible_collections/osa_ops/mcapi_vexxhost/playbooks/k8s_install.yml': line 36, column 15, but may be elsewhere in the file depending on the exact syntax problem.  14:04
<rxruby> The offending line appears to be:  14:04
<rxruby>     - import_role:  14:04
<rxruby>         name: openstack.osa.lxc_container_setup  14:04
<rxruby>               ^ here  14:04
<rxruby> EXIT NOTICE [Playbook execution failure]  14:04
<rxruby> From the error it could not find the role openstack.osa.lxc_container_setup. How do I make that role available?  14:06
<rxruby> btw, I'm testing on OpenStack Antelope  14:06
<rxruby> and testing with an openstack-ansible AIO  14:27
<andrewbonney> rxruby: there are a few patches that you'll only find in OSA master at the moment, as this work is planned for the OSA Caracal release. You'd need to backport these to your deployment for testing with Antelope  14:28
<andrewbonney> For that specific error the patch will be https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/900529  14:28
<rxruby> Sorry, how do I backport this patch? Can you give me the exact steps?  14:59
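(For reference: assuming the openstack.osa collection on the deploy host is a git checkout of openstack-ansible-plugins, typically under /etc/ansible/ansible_collections/openstack/osa, one way to pull a change down from Gerrit is a fetch plus cherry-pick. The path and the patchset placeholder below are assumptions; check the review page for the current patchset number.)

    # on the deploy host; adjust the path to wherever openstack-ansible-plugins is checked out
    cd /etc/ansible/ansible_collections/openstack/osa
    # fetch change 900529 from Gerrit (replace <patchset> with the latest patchset shown on the review)
    git fetch https://review.opendev.org/openstack/openstack-ansible-plugins refs/changes/29/900529/<patchset>
    git cherry-pick FETCH_HEAD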
<rxruby> I got it, let me try first  15:27
<rxruby> I have run the installation again and found a new error  15:31
<rxruby> TASK [vexxhost.containers.containerd : Create containerd config file] **************************************************************************************  15:31
<rxruby> An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleFilterError: Failed to import docker-image-py module, ensure it is installed on the controller  15:31
<rxruby> fatal: [aio1_k8s_container-77e91d86]: FAILED! => {"changed": false, "msg": "AnsibleFilterError: Failed to import docker-image-py module, ensure it is installed on the controller"}  15:31
<rxruby> I tried to install it manually in the k8s_container but that doesn't work, still the same error  15:32
<jrosser_> rxruby: it is needed in the ansible-runtime venv on the controller (deployment) node, not the container  16:34
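(On an OSA deploy host the Ansible runtime venv normally lives at /opt/ansible-runtime; assuming that path, installing the missing filter dependency there would look something like the following, after which the playbook can simply be re-run.)

    # on the deployment host, not inside the k8s container
    /opt/ansible-runtime/bin/pip install docker-image-py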
<drik_west> Hi all, I am looking for some advice. For a homelab I have dev1: RPi4 4GB, dev2: thin client 4c/12GB RAM, dev3: 14C/28T/64GB RAM, dev4: 14C/28T/256GB RAM, dev5: 14C/28T/128GB RAM. To reduce the power bill, devs 4 and 5 can't be on all the time, so they are used only when needed. Dev1-3 can host the required storage disks. How should I set up the infra for OSA in this case? Usage is a casual homelab with k8s. I need OS for work later this year, so thinking t  17:37
<rxruby> @jrosser @andrewbonney: once I used the ansible-runtime venv the k8s installation continued, but it is now stuck at this step:  17:44
<rxruby> TASK [vexxhost.kubernetes.kubernetes : Initialize cluster] ***************************************************************************************************************  17:44
<rxruby> fatal: [aio1_k8s_container-77e91d86]: FAILED! => non-zero return code (rc=1), cmd: "kubeadm init --config /etc/kubernetes/kubeadm.yaml --upload-certs --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests", start 17:35:21, end 17:37:37  17:44
<rxruby> stderr:
<rxruby>   W0419 17:35:21.339245 1813 initconfiguration.go:336] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
<rxruby>   W0419 17:35:30.657363 1813 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
<rxruby>   error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
<rxruby>   To see the stack trace of this error execute with --v=5 or higher
<rxruby> stdout:
<rxruby>   [init] Using Kubernetes version: v1.28.4
<rxruby>   [preflight] Running pre-flight checks
<rxruby>   [preflight] Pulling images required for setting up a Kubernetes cluster
<rxruby>   [preflight] This might take a minute or two, depending on the speed of your internet connection
<rxruby>   [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
<rxruby>   [certs] Using certificateDir folder "/etc/kubernetes/pki"
<rxruby>   [certs] Generating "ca" certificate and key
<rxruby>   [certs] Generating "apiserver" certificate and key
<rxruby>   [certs] apiserver serving cert is signed for DNS names [aio1-k8s-container-77e91d86 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.29.236.157 172.29.236.101]
<rxruby>   [certs] Generating "apiserver-kubelet-client" certificate and key
<rxruby>   [certs] Generating "front-proxy-ca" certificate and key
<rxruby>   [certs] Generating "front-proxy-client" certificate and key
<rxruby>   [certs] Generating "etcd/ca" certificate and key
<rxruby>   [certs] Generating "etcd/server" certificate and key
<rxruby>   [certs] etcd/server serving cert is signed for DNS names [aio1-k8s-container-77e91d86 localhost] and IPs [172.29.236.157 127.0.0.1 ::1]
<rxruby>   [certs] Generating "etcd/peer" certificate and key
<rxruby>   [certs] etcd/peer serving cert is signed for DNS names [aio1-k8s-container-77e91d86 localhost] and IPs [172.29.236.157 127.0.0.1 ::1]
<rxruby>   [certs] Generating "etcd/healthcheck-client" certificate and key
<rxruby>   [certs] Generating "apiserver-etcd-client" certificate and key
<rxruby>   [certs] Generating "sa" key and public key
<rxruby>   [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
<rxruby>   [kubeconfig] Writing "admin.conf" kubeconfig file
<rxruby>   [kubeconfig] Writing "kubelet.conf" kubeconfig file
<rxruby>   [kubeconfig] Writing "controller-manager.conf" kubeconfig file
<rxruby>   [kubeconfig] Writing "scheduler.conf" kubeconfig file
<rxruby>   [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
<rxruby>   [control-plane] Using manifest folder "/etc/kubernetes/manifests"
<rxruby>   [control-plane] Creating static Pod manifest for "kube-apiserver"
<rxruby>   [control-plane] Creating static Pod manifest for "kube-controller-manager"
<rxruby>   [control-plane] Creating static Pod manifest for "kube-scheduler"
<rxruby>   [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
<rxruby>   [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
<rxruby>   [kubelet-start] Starting the kubelet
<rxruby>   [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
<rxruby>   [kubelet-check] Initial timeout of 40s passed.
<rxruby>   [kubelet-check] It seems like the kubelet isn't running or healthy.
<rxruby>   [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
<rxruby>   (the two kubelet-check messages above repeat several more times)
<rxruby>   Unfortunately, an error has occurred:
<rxruby>       timed out waiting for the condition
<rxruby>   This error is likely caused by:
<rxruby>       - The kubelet is not running
<rxruby>       - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
<rxruby>   If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
<rxruby>       - 'systemctl status kubelet'
<rxruby>       - 'journalctl -xeu kubelet'
<rxruby>   Additionally, a control plane component may have crashed or exited when started by the container runtime.
<rxruby>   To troubleshoot, list all containers using your preferred container runtimes CLI.
<rxruby>   Here is one example how you may list all running Kubernetes containers by using crictl:
<rxruby>       - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
<rxruby>   Once you have found the failing container, you can inspect its logs with:
<rxruby>       - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'  17:46
<rxruby> then I changed the cgroupDriver from systemd to cgroupfs manually, and the installation continued  17:46
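(For context, that change amounts to editing the KubeletConfiguration document in the kubeadm config used by the failed task. The file path comes from the task output above; exactly which documents the vexxhost.kubernetes role templates into that file is an assumption here, and whether cgroupfs or systemd is the right driver depends on how cgroups are exposed inside the LXC container.)

    # /etc/kubernetes/kubeadm.yaml (excerpt, inside the k8s container)
    ---
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    cgroupDriver: cgroupfs   # was: systemd

A `kubeadm reset -f` inside the container before re-running the playbook is probably needed to clear the leftover state from the failed init.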
<rxruby> after that I found a new error  17:46
<rxruby> this is the new error:  17:46
<rxruby> TASK [vexxhost.kubernetes.kubernetes : Allow workload on control plane node] *********************************************************************************************  17:47
<rxruby> An exception occurred during task execution. To see the full traceback, use -vvv. The error was: urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='172.29.236.101', port=6443): Max retries exceeded with url: /version (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)')))  17:47
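(Going by the certificate SANs in the kubeadm output above, 172.29.236.101 is most likely the AIO's internal load balancer VIP and 172.29.236.157 the k8s container's own address; that is an assumption, but a rough way to see which of the two actually answers TLS on 6443, and whether haproxy has a frontend for the k8s API, might be:)

    # from the deploy host or inside the k8s container
    curl -vk https://172.29.236.157:6443/version   # kube-apiserver directly on the container
    curl -vk https://172.29.236.101:6443/version   # via the internal VIP / haproxy
    # on the haproxy node
    grep -ri 6443 /etc/haproxy/
    systemctl status haproxy

If the container address answers but the VIP does not, the haproxy side is the place to look, possibly needing more of the master-only patches andrewbonney mentioned earlier.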
<jrosser_> rxruby: please use paste.opendev.org or some other paste service instead of putting lots of debug here  18:39
<jrosser_> rxruby: I think that you are making this really difficult by wanting to deploy on Antelope  18:39
<opendevreview> Jonathan Rosser proposed openstack/openstack-ansible-ops master: Correct supported release for mcapi_vexxhost  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916536  18:45
