Upgrade from Bluefin to Cobia - Can't unlock pools from GUI, Kubernetes/Apps missing, k3s won't start

ExR90

Cadet
Joined
Sep 30, 2017
Messages
7
Went from 22.12.4.2 to 23.10.1. On boot I was not able to unlock the pool via the GUI; it said it could not find data and would need to rename my dataset folder, which sounded bad, so I did not continue. I found an article suggesting doing it over SSH instead (zfs load-key -a), which unlocked my pools just fine. The pools are healthy too: no missing disks, no unsafe shutdowns. So that works; I still can't unlock via the GUI, but that is minor.
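For anyone following the same SSH route, here is a minimal sketch of checking which datasets are still locked before and after loading keys. The helper name locked_datasets is hypothetical; it only filters the output of "zfs get -H -o name,value keystatus", which prints "unavailable" for datasets whose key is not loaded. The pool and dataset names in the comments are placeholders.

```shell
#!/bin/sh
# locked_datasets (hypothetical helper): reads the tab-separated output of
#   zfs get -H -o name,value keystatus
# on stdin and prints only the datasets whose key is not loaded.
locked_datasets() {
    awk -F '\t' '$2 == "unavailable" { print $1 }'
}

# Typical use on the affected system (needs ZFS and root, so shown as comments):
#   zfs get -H -o name,value keystatus | locked_datasets
#   sudo zfs load-key -a    # load keys for all encryption roots, as in the post
#   zfs get -H -o name,value keystatus | locked_datasets   # should print nothing
# Loading a key does not mount anything by itself; "sudo zfs mount -a" may
# also be needed if the datasets stay unmounted.
```

Unencrypted datasets report "-" for keystatus and are skipped by the filter.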

The main problem now is:

None of the containers are running, as k3s is kaput. I bet the preceding issue is related, as the ix-applications folder is located within one of the pools I could not unlock via the GUI. I normally have to restart my apps after unlocking the pools, but I can't seem to do that now even with the pools unlocked. Rebooting does not help: I still have to unlock by hand and the apps are still broken.

Trying to start k3s leads to this error:
sudo systemctl start k3s
Job for k3s.service failed because of unavailable resources or another system error.
See "systemctl status k3s.service" and "journalctl -xeu k3s.service" for details.

Running "journalctl -xeu k3s.service" shows the following on a loop:
░░ The job identifier is 11730 and the job result is done.
Jan 04 02:34:25 truenas systemd[1]: k3s.service: Failed to load environment files: No such file or directory
Jan 04 02:34:25 truenas systemd[1]: k3s.service: Failed to run 'start-pre' task: No such file or directory
Jan 04 02:34:25 truenas systemd[1]: k3s.service: Failed with result 'resources'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support

Looking at "sudo journalctl -u k3s -n 1000", I also get the following on a loop:

Jan 04 02:28:02 truenas systemd[1]: k3s.service: Scheduled restart job, restart counter is at 76.
Jan 04 02:28:02 truenas systemd[1]: Stopped k3s.service - Lightweight Kubernetes.
Jan 04 02:28:02 truenas systemd[1]: k3s.service: Failed to load environment files: No such file or directory
Jan 04 02:28:02 truenas systemd[1]: k3s.service: Failed to run 'start-pre' task: No such file or directory
Jan 04 02:28:02 truenas systemd[1]: k3s.service: Failed with result 'resources'.
Jan 04 02:28:02 truenas systemd[1]: Failed to start k3s.service - Lightweight Kubernetes.
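The "Failed to load environment files" lines mean systemd could not read a file named by an EnvironmentFile= directive in the unit. A minimal sketch for finding out which file is missing follows. The helper name missing_env_files is hypothetical; it assumes the usual "systemctl show" output format of one "EnvironmentFiles=PATH (ignore_errors=...)" line per file.

```shell
#!/bin/sh
# missing_env_files (hypothetical helper): reads lines such as
#   EnvironmentFiles=/some/path (ignore_errors=no)
# on stdin and prints every path that does not exist on disk.
missing_env_files() {
    sed -n 's/^EnvironmentFiles=\(.*\) (ignore_errors=.*)$/\1/p' |
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Typical use on the affected system (shown as comments):
#   systemctl show k3s.service -p EnvironmentFiles | missing_env_files
#   systemctl cat k3s.service   # full unit, including any ExecStartPre= lines
```

Any path it prints is a file the unit expects but cannot find, which is worth including in a bug report.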

The folder that Kubernetes uses according to the Apps config is there, and I do see config files within it:

-rw-r--r-- 1 root root 60 Jan 4 00:09 app_migrations.json
drwxr-xr-x 5 root root 5 Jan 4 00:30 backups
drwxr-xr-x 4 root root 4 Jan 3 17:59 catalogs
-rw-r--r-- 1 root root 406 Nov 4 18:39 config.json
drwxr-xr-x 2 root root 2 Feb 9 2023 default_volumes
drwx--x--- 14 root root 14 Nov 4 18:39 docker
drwxr-xr-x 5 root root 5 Mar 2 2023 k3s
-rw-r--r-- 1 root root 89 Nov 4 18:39 migrations.json
drwxr-xr-x 15 root root 15 Jan 4 00:19 releases

Any ideas?
 

ABain

Bug Conductor
iXsystems
Joined
Aug 18, 2023
Messages
172
One suggestion would be, after unlocking the pools (which I think is the state you are in now), to reset the Apps pool. I would recommend collecting a debug before doing this and, if you continue to have problems, filing a bug report with the debugs from before and after trying this step.
 

ExR90

Cadet
Joined
Sep 30, 2017
Messages
7
You're a gentleman and a scholar. I made snapshots of the app datasets first, unset the pool, then used "set the pool", selecting the same one as before, and now it is happy. All the apps came back but were not accessible. I had to remove MetalLB and re-add it, as well as reconfigure the IP pool, but then all apps worked again.
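The snapshot step can be sketched from the shell as below. This assumes the apps data lives in a dataset named ix-applications directly under the chosen pool; the helper name apps_snapshot_cmd, the pool name "tank" and the snapshot tag are all placeholders. The helper only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# apps_snapshot_cmd (hypothetical helper): prints, without running it, a
# recursive snapshot command for the apps dataset of the given pool.
# Assumption: the apps dataset is <pool>/ix-applications.
apps_snapshot_cmd() {
    pool=$1
    tag=${2:-pre-reset}
    printf 'zfs snapshot -r %s/ix-applications@%s\n' "$pool" "$tag"
}

# Example ("tank" is a placeholder pool name), shown as comments:
#   apps_snapshot_cmd tank                    # review the printed command
#   sudo sh -c "$(apps_snapshot_cmd tank)"    # then run it as root
```

The -r flag snapshots the dataset and all of its children in one step.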
 