In the next couple of years, I expect to hit 80% utilization on my pool. I'm mostly out of bays, so my plan is to expand the pool by replacing the oldest drives with higher-capacity models. My concern is that once I do, the new disks will be large enough to increase the risk of a dual failure during a resilver. Therefore I want to refactor both vdevs comprising the pool from Z1 to Z2.
Edit: added another idea for restoring the configuration: swap the boot pool mirror drives instead of restoring the database.
Here is the background:
- I am using SCALE
- I have 30 containers running
- Pool is comprised of 2 vdevs
- Each vdev is a 5-wide Z1 configuration: one of 5x 2 TB drives and one of 5x 4 TB drives
- I want to refactor each vdev from Z1 to Z2
- A friend has plenty of spare capacity on his Z2 pool on a CORE server
- I have the important data already backed up on GCS and S3, but it's object storage with high retrieval costs (intended for worst-case recovery)
- I have the largest datasets also replicated to a single extra disk on my system (normally used as large scratch space), in the hope of avoiding pulling the entire set back from my friend (see the question about this below)
- rsync.net is also a great option for this procedure, but going with the willing friend saves a few hundred dollars; one concern that applies to both options is included in the questions below
Here is the plan:
- Friend creates a non-superuser account on his CORE server, and I give him my SSH pubkey
- Friend creates a dataset for me and assigns full permissions to my user
- Friend runs `zfs allow <user> create,destroy,diff,mount,readonly,receive,release,send,userprop <pool/dataset>`
- I add this as an SSH Connection on my SCALE server
- I recursively snapshot my pool
- I create a replication task to push all of my datasets to his server, including hundreds of child snapshots (using destination encryption)
- I have ~11 TiB to push over via my 500 Mbps upstream to his 1 Gbps downstream; best case this will take ~3 days
- When the replication finishes, stop all services and containers
- Recursively snapshot again
- Run the replication task again to get any stragglers
- Backup the SCALE database and SSH keys
- Remove one of the boot pool mirror drives to preserve the running configuration
- Reset the SCALE configuration to defaults
- Wipe the pool disks
- Create new vdevs from the same disks, but using Z2 this time instead of Z1, using the original pool name
- Setup the replication tasks again
- Replicate available datasets from the single disk to the new pool (see question about this below)
- Replicate the remaining datasets back from the friend's pool to the new pool
- Swap the boot pool mirror drives to restore the original configuration (or, alternatively, restore the SCALE database)
- Reboot and hope that all of the services and containers come into operation like nothing happened
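The ~3 day transfer estimate in the plan above can be sanity-checked with a quick calculation (the 500 Mbps uplink is the bottleneck; the 25% overhead factor is an assumption for SSH/TCP framing and pool read contention):

```python
TIB = 2 ** 40             # bytes in one tebibyte
data_bits = 11 * TIB * 8  # ~11 TiB payload, in bits
uplink_bps = 500e6        # 500 Mbps upstream (the slower side of the link)

ideal_days = data_bits / uplink_bps / 86_400
# Protocol overhead and contention easily add 20-30%; assume 25% here.
realistic_days = ideal_days * 1.25

print(f"ideal: {ideal_days:.1f} days, with overhead: {realistic_days:.1f} days")
```

So the wire-speed best case is about 2.2 days, and ~3 days is a realistic expectation.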
Questions:
- Does ZFS checksum single-disk pools? It seems so, since the pool status shows the read/write/cksum counters, so I should be able to scrub after replication, and also watch for errors while replicating back to the refactored pool. (I understand, of course, that if there is an error, that data is lost; the single disk only exists to speed up recovery after the refactor, and on any error I will fall back to pulling from the remote server.)
- Assume I am running the latest pool version in SCALE and he is running the latest pool version in CORE; isn't SCALE still ahead on the ZFS version? Can I send datasets to a pool running an older version? Or does the version conflict only cause problems when attempting to import? This could also affect the rsync.net option.
- CORE/FreeBSD has a sysctl to allow non-root to mount ZFS datasets, but I shouldn't need to use this if I'm just replicating and not trying to access the data directly on his server, right?
- If he were to upgrade to SCALE: given the possible ZFS pool version mismatch described above, and given that SCALE/Linux does not support mounting datasets by non-root users, would I still be able to replicate to a non-root account, since I don't need to mount the datasets on his server?
- If my new pool and datasets are exactly the same names after the refactor, all my services and containers should work after restoring the database, right?
- Any more gotchas I didn't think of?
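On the single-disk question above: ZFS checksums every block regardless of redundancy, so a scrub on a single-disk pool will detect corruption (though it cannot repair it). A minimal sketch, assuming the scratch pool is named `scratch` (hypothetical name):

```shell
# Scrub the single-disk pool after replicating to it; checksums are
# verified even without redundancy, so errors are detected but not healed.
zpool scrub scratch

# Check progress, and look for non-zero READ/WRITE/CKSUM counters when done.
zpool status -v scratch
```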