Recovering from hard drive upgrade/resilver ending in inconsistent state

VulcanRidr

Explorer
Joined
Jan 5, 2015
Messages
59
Hope everyone had a great Christmas,

My TrueNAS box consists of six 4TB drives in a RAID-Z2 pool. I found a good deal on 10TB HGST drives, got two for my birthday, and replaced the first two drives with them. Well, my gift from my family was the four remaining 10TB drives with which to complete the upgrade.

So drives 0 and 1 had been upgraded earlier in the year. For each of the remaining drives in turn, I powered down the system, replaced the 4TB drive with a 10TB drive, powered up, did a Replace under Storage -> Pools -> Status, and let the system resilver. When it completed, I moved on to the next drive. All was going swimmingly until the final drive. It gave me some (72) write errors, then the system rebooted. When it came back up, the write errors had cleared, and it completed the resilver. However, the pool still shows as DEGRADED, and zpool status on the data pool shows:
Code:
root@luna[~]# zpool status NX80101             
  pool: NX80101
 state: DEGRADED
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 9.57G in 13:24:31 with 0 errors on Thu Dec 28 14:26:02 2023
config:

        NAME                                                  STATE     READ WRITE CKSUM
        NX80101                                               DEGRADED     0     0     0
          raidz2-0                                            DEGRADED     0     0     0
            gptid/f875bb06-1488-11ee-abb2-0cc47a7e13b8.eli    ONLINE       0     0     0
            gptid/39f81b1f-2f31-11ee-b805-0cc47a7e13b8.eli    ONLINE       0     0     0
            gptid/8d46a0c4-a428-11ee-9f7c-0cc47a7e13b8.eli    ONLINE       0     0     0
            gptid/64879c61-a385-11ee-8760-0cc47a7e13b8.eli    ONLINE       0     0     0
            gptid/56304af5-a4ca-11ee-83e7-0cc47a7e13b8.eli    ONLINE       0     0     0
            replacing-5                                       UNAVAIL      0     0     0  insufficient replicas
              13809399577626154655                            UNAVAIL      0     0     0  was /dev/gptid/99538994-abb8-11eb-ba64-0cc47a7e13b8.eli
              gptid/87677b78-a546-11ee-b4dc-0cc47a7e13b8.eli  REMOVED      0     0     0

errors: No known data errors


In the Pool Status display in the GUI (https://imgur.com/BWelmtm), when I click on the vertical dots next to either of the /dev/gptid devices, it gives me the options to Edit, Offline, Replace, Remove, or Detach. I tried clicking on Replace, but there are no Member Disks to choose.

So a couple of questions. There are two /dev/gptid devices under replacing-5, one showing UNAVAIL and one showing REMOVED. I am taking an educated guess that the REMOVED one is the new 10TB drive that had the interrupted resilver, and the UNAVAIL one is the old 4TB drive it was replacing. How can I get the new drive back into the pool? Detach seems to be a permanent solution, and Remove seems almost as permanent. Since the resilver completed, is there a way to get the pool to recognize the 6th drive?
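
To narrow down which entries are unhealthy without reading the whole table by eye, the status output can be filtered. This is only a sketch: `unhealthy_vdevs` is a hypothetical helper name, and it assumes the config table layout shown above (a NAME/STATE header row, ending at the `errors:` line). It prints the name and state of every entry that is not ONLINE.

```shell
#!/bin/sh
# Print "<name> <state>" for every entry in the config table of
# `zpool status` output (read from stdin) whose state is not ONLINE.
# unhealthy_vdevs is a hypothetical helper name, not a ZFS command.
unhealthy_vdevs() {
    awk '
        # The device table starts after the NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The "errors:" line marks the end of the table.
        /^errors:/                    { in_table = 0 }
        # Rows with fewer than two fields (blank lines, section labels) are skipped.
        in_table && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage (on the NAS itself):
#   zpool status NX80101 | unhealthy_vdevs
```

Run against the output above, this should list the pool, raidz2-0, replacing-5, and the two children of replacing-5. The child shown only as a number with a "was /dev/gptid/..." note is a device ZFS can no longer open, so it is reported by its GUID along with the path it used to have.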

Thanks,
--vr
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Try zpool clear NX80101.
Run zpool status -g NX80101 and see which drive is which, no need for guesswork.
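
To go from a gptid in the pool listing to a physical disk, one option is `glabel status`, which lists each label next to the partition it lives on. This is a sketch, assuming a FreeBSD-based TrueNAS CORE system (the .eli suffixes above suggest GELI); `gptid_to_partition` is a hypothetical helper name.

```shell
#!/bin/sh
# Given a pool member name such as gptid/<uuid>.eli, print the partition
# that `glabel status` output (read from stdin) lists for that label.
# gptid_to_partition is a hypothetical helper name.
gptid_to_partition() {
    # The pool shows the GELI provider (.eli); glabel lists the bare label.
    label=${1%.eli}
    awk -v want="$label" '$1 == want { print $3 }'
}

# Usage:
#   glabel status | gptid_to_partition gptid/87677b78-a546-11ee-b4dc-0cc47a7e13b8.eli
```

If the command prints nothing, the system does not currently see that label at all, which would point at the disk or its connection rather than at the pool.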
 

VulcanRidr

Thanks for your response, @Davvo,

Turns out the drive actually was bad, and the completion of the resilver was a bit of a red herring. I shut down the NAS, pulled the drive, and tried to format it using the gparted live CD on a different machine. Apart from a stream of error messages on boot, there was no sign of the drive, and gparted did not detect it.

I already knew which drive it was, since I am in the middle of the upgrade, swapping the old drives out one at a time and allowing each to resilver. I am also labeling each drive slot with the serial number of the drive that lives in it.
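
For anyone wanting to do the same slot labeling, the serial number can be read from `smartctl -i` rather than off the drive sticker. A minimal sketch, assuming smartmontools is installed; `serial_from_smartctl` is a hypothetical helper name and /dev/ada5 is only a placeholder device.

```shell
#!/bin/sh
# Extract the serial number from `smartctl -i` output read on stdin.
# serial_from_smartctl is a hypothetical helper name.
serial_from_smartctl() {
    # Match the field name case-insensitively, since the capitalization
    # differs between ATA ("Serial Number") and SCSI ("Serial number") output.
    awk -F': *' 'tolower($1) == "serial number" { print $2 }'
}

# Usage (/dev/ada5 is a placeholder, substitute the real device):
#   smartctl -i /dev/ada5 | serial_from_smartctl
```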
 