hexadecagram
Dabbler
- Joined
- Jul 15, 2016
- Messages
- 32
Hello fellow FreeNAS fans,
So I just spent about 4 hours trying to replace the ZIL in an iXsystems FreeNAS Mini XL running 11.1-U6. Everything seems to be fine at this point, but I thought I should share my experience, because what I ran into looks likely to be a bug.
Like anyone with a bit of sense, I immediately turned to the User Guide for direction. My first stop was section 8.1.10. Because my pool is encrypted, I was directed from there to section 8.1.10.1, which in turn pointed me to 8.1.1.1. One thing to note there is that L2ARC drives are supposedly not encrypted in encrypted pools, yet mine is GELI'ed. That statement seems to be a new addition to the User Guide, because I don't recall seeing it before. Perhaps that is the cause of this behavior?
Anyhow, section 8.1.1.1 also states that ZIL drives ARE encrypted in encrypted pools. Checking my machine, the ZIL is indeed GELI'ed as well, so my interpretation of what is written is that I just need to treat it like any other encrypted drive when replacing it.
So I go back and start re-reading section 8.1.10.1 and see that I need to add a passphrase to my pool to start the process. Okay, done.
Back to section 8.1.10 now to follow the steps as prescribed in 8.1.10.1. Step 1, offline the drive: done, with no complaint from the GUI. Step 2, shut down and replace the drive: done. Step 3, boot back up and click "Replace Disk": only partially done, because after booting the pool was no longer displayed as Locked, as it always had been in the past. To my surprise it was marked UNKNOWN!
At that point I ssh into the box, run zpool status, and see that every drive in the pool is marked UNKNOWN. I fiddle around in the GUI a fair bit and get tracebacks from various Python scripts as I try to get information about my pool. Uh oh.
So the first thing I do is check /dev, and it turns out that for some reason GELI didn't attach to the drives on boot. I don't remember everything I did after this point, and it took a lot of experimentation to get to a working result, but here's a rough summary of what I can remember doing to get the pool back online. Note that at certain points drives were UNAVAIL, but I can't remember exactly when.
- Shut the machine down once again and put the old ZIL back in; the pool still came back UNKNOWN.
- Shut it down again, reinstalled the new ZIL, and booted up; the pool was UNKNOWN yet again.
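- Noticed many repeating log messages such as this:
ZFS: vdev state changed, pool_guid=mypoolguid vdev_guid=randomnumber
- Exported the pool, attached the GELI providers by hand, and re-imported the pool (the -m flag lets the import proceed with a missing log device):
zpool export mypool; ( for prov in /dev/gptid/*; do geli attach -k /data/geli/mypool.key "$prov"; done ); zpool import -m mypool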
- Clicked "Replace Disk" in the GUI, waited for resilvering to complete, followed the remaining steps in section 8.1.10.1.
- Rebooted.
- Once again the GELI devices did not attach on boot (although /dev/mirror/swap[0-3].eli were up).
- Ran the same export, attach, and import sequence, this time without -m:
zpool export mypool; ( for prov in /dev/gptid/*; do geli attach -k /data/geli/mypool.key "$prov"; done ); zpool import mypool
- Changed the passphrase (reusing the one I had set while following section 8.1.10.1).
- Rebooted.
- Yay! The pool came up in the Locked state.
- Unlocked it and rebooted.
- Locked state again.
- Unlocked it, removed the passphrase, and rebooted.
- The pool came up online.
- Rebooted again.
- The pool was still online.
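For anyone who ends up in the same spot, the manual export, attach, and import sequence from the steps above could be wrapped up roughly like this. Treat it as a sketch, not a tested recovery tool: the pool name, key path, and /dev/gptid glob are the ones from my commands, while DRY_RUN and the function names (run, unattached_providers, reattach_pool) are hypothetical helpers I made up for illustration. It also assumes that a provider with a matching .eli node is already attached and can be skipped, which my original one-liner did not check.

```shell
#!/bin/sh
# Sketch: manually attach GELI providers and re-import an encrypted pool.
# Assumptions: POOL, KEY and PROVIDER_GLOB default to the values used in the
# post; DRY_RUN and all function names are hypothetical, not part of FreeNAS.

POOL="${POOL:-mypool}"
KEY="${KEY:-/data/geli/mypool.key}"
PROVIDER_GLOB="${PROVIDER_GLOB:-/dev/gptid/*}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List providers that have no matching .eli node, i.e. the ones
# GELI did not attach on boot.
unattached_providers() {
    for prov in $PROVIDER_GLOB; do
        case "$prov" in
            *.eli) continue ;;          # skip already-attached GELI nodes
        esac
        [ -e "$prov" ] || continue      # glob matched nothing
        [ -e "$prov.eli" ] || echo "$prov"
    done
}

# Export the pool, attach every unattached provider with the pool key,
# then import again. -m lets the import proceed with a missing log device.
reattach_pool() {
    run zpool export "$POOL"
    for prov in $(unattached_providers); do
        # Providers that are not part of the pool are attempted too,
        # just as in the original one-liner.
        run geli attach -k "$KEY" "$prov"
    done
    run zpool import -m "$POOL"
}
```

With the defaults, calling reattach_pool only prints what it would do; after reviewing that output, something like `DRY_RUN=0 reattach_pool` would run it for real. If the pool has a passphrase set, geli attach will prompt for it on each provider.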