Pool suddenly offline, CLI import shows corruption.

HeroRareheart

Dabbler
Joined
Mar 5, 2022
Messages
16
Hi everyone, I have a bizarre issue. When I went to bed last night my TrueNAS SCALE setup was fine: I had checked the terminal in the garage when I came home from work and saw nothing out of order, and all my applications were working. When I woke up this morning, however, all my apps were offline. Checking the web GUI, I saw two notifications:

Code:
Failed to configure kubernetes cluster for Applications: Missing "pool-0/ix-applications/docker, pool-0/ix-applications/releases, pool-0/ix-applications/k3s" dataset(s) required for starting kubernetes.


and

Code:
Pool pool-0 state is OFFLINE: None


After some snooping about in the web GUI I ended up in the "Storage" tab, where all my disks were labeled unassigned. Under "Manage Disks" I could see that all the disks in pool-0, my pool, are labeled as exported. I then tried to import pool-0 via the web GUI, but the drop-down menu shows no pools available to import. Based on some older forum posts, I believe the next step is to import the pool via the CLI, but I do not know what commands to run. Every post I found said to run the zpool command followed by some string of flags, but every time I try to run zpool in the CLI I get a "command not found" error. Am I missing something? Is there a different command now, or is this not even what I should be doing at this point?

Edit: I forgot my TrueNAS version, it's TrueNAS-SCALE-22.12.3.3
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
try sudo zpool
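
For anyone who finds this later with the same "command not found" error: the likely cause is that zpool sits in a system directory that is not on a normal user's PATH, so running it through sudo finds it. A minimal sketch of two read-only commands to start with follows. The ZPOOL variable and the function names are placeholders of mine, not anything TrueNAS ships.

```shell
#!/bin/sh
# Sketch only. ZPOOL is a hypothetical override; on the NAS itself leave it
# unset so the commands run as "sudo zpool ...".
ZPOOL="${ZPOOL:-sudo zpool}"

# List pools that are visible on disk but not currently imported.
# Without a pool name this only reports; it does not import anything.
list_importable() {
    $ZPOOL import
}

# Show the health of pools that are already imported (for example the boot pool).
show_status() {
    $ZPOOL status
}
```

Call list_importable first and read the state and status lines before trying any actual import.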
 

HeroRareheart

"every time I try to run zpool in the CLI I get the command not found error, am I missing something?"
My dumbass was running this as a regular user, not as root. Using sudo gave me output from zpool. Running "sudo zpool status" told me that the boot pool needs a ZFS upgrade, and running "sudo zpool import" gets me this output:

Code:
   pool: pool-0
     id: 3186709160161622824
  state: FAULTED
 status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

        pool-0                                    FAULTED  corrupted data
          raidz2-0                                ONLINE
            4db288eb-e0d9-42ca-a9f1-55d6afdddd8c  ONLINE
            68764cbc-eefb-40a2-b770-6355a0bc0145  ONLINE
            d5b22ee2-043f-4ad9-a7cf-823291d44e9b  ONLINE
            40e755e6-da7d-451d-abb7-f66676430a72  ONLINE
            a1001739-a742-45fb-a40b-b7b883c248f7  ONLINE
            02814913-a5d4-4ad5-b1ff-4b41ff46ffc2  ONLINE
            c886495f-590d-4904-a730-75ca4ad99d64  ONLINE
            d3e7d0c7-7b0c-4801-8a74-6909120588a9  ONLINE


I've also noticed that my reported RAM amount has shrunk. Based on some ZFS errors from a few days ago, my rough guess is that a stick of RAM has gone bad and is causing mild corruption. Even if that's not how this works, it won't hurt to run Memtest86+ before continuing any further.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
ECC ram?

I suspect you are right in your diagnosis.
 

HeroRareheart

"ECC RAM? I suspect you are right in your diagnosis."
It is ECC RAM, but Memtest86+ came back with a pass after two full runs, so it's not the memory. After running "sudo zpool import -F -n pool-0" I am told I'll lose only 5 seconds of transactions, so I have NO CLUE what happened, but everything seems fine.
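
The dry run above can be paired with the real recovery import once the reported loss looks acceptable. A hedged sketch: -F asks ZFS to try rolling the pool back to an earlier transaction group, and adding -n only reports whether that would work without changing anything. ZPOOL, POOL, and the function names are placeholders of mine, and nothing in this thread confirms the real import was run this way.

```shell
#!/bin/sh
# Sketch only. ZPOOL and POOL are hypothetical overrides for illustration.
ZPOOL="${ZPOOL:-sudo zpool}"
POOL="${POOL:-pool-0}"

# Dry run: report whether a rewind would make the pool importable and
# roughly how much recent data would be discarded. Nothing is written.
rewind_dry_run() {
    $ZPOOL import -F -n "$POOL"
}

# Real recovery import: discards the last few transactions, so run it only
# after the dry run reports a loss you can live with.
rewind_import() {
    $ZPOOL import -F "$POOL"
}
```

Running rewind_dry_run again right before rewind_import is cheap and confirms the estimate has not changed.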
 

sfatula

Did you check the system logs? Surely something there would give a hint, such as an adapter disconnect. Maybe it's a motherboard issue.
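
One rough way to start on the logs is to pull the kernel messages and filter them for disk or controller trouble. This is only a sketch: the pattern list is a guess on my part, and filter_storage_errors is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: keep kernel log lines that hint at disk, link, or HBA problems.
# The patterns are illustrative guesses, not an exhaustive list.
filter_storage_errors() {
    grep -iE 'i/o error|resetting link|offline|zfs|mpt[23]sas'
}
```

Feed it the kernel log, for example "sudo journalctl -k -b -1 | filter_storage_errors" for the previous boot (this assumes the journal is kept across reboots) or "sudo dmesg | filter_storage_errors" for the current one.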
 