
Creating a degraded pool

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
danb35 submitted a new resource:

Creating a degraded pool - Creating an intentionally-degraded pool to allow for later expansion/redundancy

WARNING: HERE BE DRAGONS. IF YOU DON'T KNOW EXACTLY WHAT YOU'RE DOING, DO NOT FOLLOW THESE INSTRUCTIONS.

IN FACT, IF YOU NEED THESE INSTRUCTIONS, YOU PROBABLY SHOULDN'T FOLLOW THEM.


There are a few cases where it may make sense to create a degraded pool in your FreeNAS server. One of the most common would be a case where you have all the disks you need for your pool, but one of them already has data on it. The degraded pool would let you create,...

Read more about this resource...
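
For anyone skimming the thread, the core of the technique is roughly this (a minimal sketch, assuming a 3-disk RAIDZ1 and made-up device names; the resource itself covers the proper partitioning, gptids and caveats):

Code:
# sparse file at least as large as the missing disk
truncate -s 4T /root/sparsefile

# create the pool from the real disks plus the file (-f is typically needed when mixing a file with real disks)
zpool create -f mypool raidz1 /dev/da1 /dev/da2 /root/sparsefile

# offline the file immediately so nothing ever gets written to it
zpool offline mypool /root/sparsefile

# later, when the real disk is free, resilver it in place of the file
zpool replace mypool /root/sparsefile /dev/da3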
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
IN FACT, IF YOU NEED THESE INSTRUCTIONS, YOU PROBABLY SHOULDN'T FOLLOW THEM.
I feel this way about a lot of things... Regardless, thank you for the guide!
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
This is a great tip. I believe I am a candidate for this. I have a single drive (in the same box as the source pool) as a target for replication. I want to expand this to the RAIDZ2 on a second FreeNAS box, but don’t want to replicate again from the source pool.

Sound reasonable?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I believe I am a candidate for this.
If I'm understanding you correctly, I don't think you are. How are you thinking that this guide would help you do what you're wanting to do?
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
If I'm understanding you correctly, I don't think you are. How are you thinking that this guide would help you do what you're wanting to do?
My thinking is that because the current backup drive has all of the data and will be one of the disks in the planned RAIDZ2 pool, I can set up the new pool with 4 physical drives and the sparse file. Once the pool is created, I can swap the sparse file for the physical backup drive and then resilver. Done this way, the data on the backup drive remains “intact” during the process, versus creating the RAIDZ2 with all 5 drives (including the backup drive) from the start, where the data on the backup drive would be destroyed and would need to be replicated again in its entirety.

If this was your understanding, please explain my error.
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
When you resilver the backup drive into the RAIDZ2 pool, all the data on that backup drive will be lost.
I see now. I went back and re-read the guide and along with your reply, I see my mistake. Thanks for the clarification!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I'm trying to do something a bit different and need to write down my steps to plan and confirm what I'm doing, so I thought I would just post my workings here, since they use this resource in large part (I'm fine with a mod moving this somewhere else if that's preferred).

First, what am I doing?
I have a pool with 2 VDEVs, both 8 disks wide, RAIDZ2 (vol5)
I have an additional pool with a single 8-disk vdev RAIDZ2 (vol1)

My ultimate goal is to replace all the disks in vol1 with bigger ones, scrap that pool and make those 8 disks a third VDEV in vol5, carrying over all the content from vol1 to vol5 in the process.

Needless to say, in all of this I am prepared for the loss of all data on this FreeNAS server before commencing, so in the worst-case scenario I just rebuild it all and restore from backup... but nobody wants to do that; it's no fun and certainly not plan A for me (more like plan Z).

Also, I am doing it this way because I don't want to go above 80% on vol5, which would happen along the way if I just moved the data there directly. I also wanted to balance the VDEVs a bit rather than further stacking the existing 2 VDEVs with all the new data while leaving the new VDEV completely empty, hence this less direct route.

So, I have only 24 slots in my chassis, meaning there's no easy way to just attach the 8 new disks, add them as the new VDEV, and be done with it.

To begin, I will stop all the jails with mounts for vol1 (keeping a list of them to reinstate with the new vol5 locations) and will take note of SMB shares for that pool.
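
A sketch of what that looks like with iocage jails (jail names are placeholders; I believe iocage fstab -l lists the mounts):

Code:
# note which jails are running
iocage list

# record the vol1 mounts for each affected jail
iocage fstab -l myjail1 | grep vol1

# stop each affected jail
iocage stop myjail1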

My system dataset is on a different pool (vol6), so I won't need to move that off vol1.

What I will do (since I'm lucky enough not to have too much data on vol1) is:

  1. Remove a disk (offline it) from vol1
    • Using the GUI (Storage | Pools | vol1 ... status: select disk ... offline)
  2. Replace it with one of the new larger disks
    • Physically remove the old and insert the new, nothing in the GUI or CLI here.
    • (In theory I could take a little more risk here if I needed the space and make a striped pool with 2 of the new disks... then having no redundancy in vol1 while copying... and clearly none in the temp copy in either case.)
    • Also worth noting that I record the position in the chassis, capacity, serial number and disk model/type, so I need to update those records as I go throughout this process.
  3. Create a temp pool on that single disk
    • zpool create tmpcopy da22
  4. snapshot and zfs send|recv vol1 to it
    • zfs snapshot -r vol1@tempsend
    • zfs send -R vol1@tempsend | pv | zfs recv tmpcopy/vol1 (maybe want to do this in a tmux session as it could take a long time and SSH can get disconnected)
  5. Detach vol1 (not erasing the content) and remove all the old disks (leaving the 1 replaced one with the temp pool in place... 7 slots now open in the chassis)
    • Storage | Pools | vol1 cogwheel Export/Disconnect
    • Then physically remove all the old disks from that pool
    • Also an interesting note, a little related to the keeping of records while I do this... Since my VDEVs are important to the health of the pool, I don't want any of them to be open to failures due to other components, particularly the backplane (mine is a 6-row system with 4 disks per row). So I make sure that my layout has no more than 2 disks from any one VDEV in any of the 4-disk rows, meaning a backplane failure of one row can't take any pool offline. A convenient side effect is that the heat distribution is better, as VDEVs work independently and one may be busy while another is not.
  6. add the 7 new disks into the chassis
    • Physically, nothing in GUI or CLI for this step
  7. prepare all of the disks as per @danb35's process (burning them all in beforehand would be a good thing to do here too)
    • gpart create -s gpt /dev/da18
    • gpart add -i 1 -b 128 -t freebsd-swap -s 2g /dev/da18
    • gpart add -i 2 -t freebsd-zfs /dev/da18
    • repeat x7
  8. Create the sparsefile
    • truncate -s 10T /root/sparsefile
  9. checkpoint the pool vol5 (so if it all goes bad I don't kill vol5 or at least have a chance of coming back to the "before" state)
    • zpool checkpoint vol5
  10. Zpool add the new vdev (using a slightly modified version of the process since it's an existing pool, not a new one) using the 7 disks and the sparsefile
    • zpool add -f vol5 raidz2 /root/sparsefile gptid/dee8fa86-c056-11e8-90eb-002590caf340 gptid/e1e2ba67-c056-11e8-90eb-002590caf340 .......... +5 more gptids
    • Offline the sparsefile zpool offline vol5 /root/sparsefile
  11. zfs send|recv vol1 from the temp pool to vol5 (maybe move things around a bit in that process, but still using send|recv for each dataset)
    • zfs snapshot -r tmpcopy/vol1@tempsend
    • zfs send -R tmpcopy/vol1/data@tempsend | pv | zfs recv vol5/data ...
  12. check all is OK on vol5
    • zpool status -v vol5
    • ls -l or whatever
  13. Remove the checkpoint
    • zpool checkpoint -d vol5
    • Interesting to note that you can't replace a failed disk on a pool while a checkpoint is in place
  14. destroy the temp pool and wipe that disk
    • zpool destroy tmpcopy (note: zpool destroy has no -r flag; that belongs to zfs destroy)
    • Wipe the disk in the GUI. Storage | Disks Select disk ... Wipe
  15. use that wiped disk to replace the sparsefile
    • Should work in the GUI. Storage | Pools | vol5 ... Status, select the sparsefile entry ... Replace (select the wiped disk); see also the CLI sketch after this list.
  16. delete the sparsefile
    • rm /root/sparsefile
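A CLI sketch for step 15, in case the GUI doesn't cooperate (the gptid is a placeholder for whatever partition the wiped disk ends up with after repartitioning it as in step 7; add -f if zpool complains):
    • zpool replace vol5 /root/sparsefile gptid/<new-disk-gptid>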
Then re-do the jail mounts and re-share for SMB and start the jails.
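
For the jail side, something like this (a sketch with placeholder names and paths; the fstab entry format is source destination fstype options dump pass):

Code:
# re-add each mount at its new vol5 location
iocage fstab -a myjail1 /mnt/vol5/data /mnt/data nullfs rw 0 0

# start the jail again
iocage start myjail1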

So I have several thoughts about what could go wrong at which points and what I would do about that:

If the new drive fails between steps 4 and 13:
I have all the data on the 7 original disks from the vol1 pool; just back out to the beginning and start again after RMA of the bad disk.

If a disk fails in vol1 between steps 4 and 5:
Sweat it out and live with 6 disks and no redundancy, or back out, replace the bad disk with a spare, and wait for the resilver before progressing.
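If I did go the spare route, the replacement itself would be roughly this (gptids are placeholders, and the spare would need partitioning as in step 7 first):
    • zpool replace vol1 gptid/<failed-disk-gptid> gptid/<spare-disk-gptid>
    • zpool status -v vol1 to watch the resilver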

If something goes wrong with step 10:
roll back the checkpoint on vol5
  • zpool export vol5
  • zpool import --rewind-to-checkpoint vol5
  • zpool export vol5
  • Import the pool from the GUI
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
As I'm investigating different avenues forward for my storage configuration, I tried this resource guide out.
As it turns out, it requires some tweaks as of TrueNAS-13.0-U4.

The problem I ran into:
At the stage where the pool is reimported, the GUI no longer allows you to do the replace step on the offlined sparse file.
(Screenshot: 2023-04-23_18-51.jpg)


Turning to CLI, there is not much luck either:
Code:
zpool replace -f blankpool /root/sparsefile gptid/ed391415-e1ef-11ed-8029-ac1f6bb3a54c                       
cannot replace /root/sparsefile with gptid/ed391415-e1ef-11ed-8029-ac1f6bb3a54c: already in replacing/spare config; wait for completion or use 'zpool detach'


The problem appears to be related to ashift.
When the pool is created as per the guide, it defaults to ashift=9 (which shows up as "0" in zpool get all; a stored value of 0 just means the property was never set explicitly and ZFS auto-detected it).
Code:
truenas# zpool get all | grep ashift
blankpool  ashift      0     local
boot-pool  ashift      0     default
tank       ashift      12    local

In order to avoid the error, you would need to specify ashift=9 when replacing the drive.
That would look something like zpool replace -o ashift=9 -f blankpool /root/sparsefile gptid/asdf...
When doing that, however, the output of zpool status -v blankpool (sorry, I didn't save an explicit example) indicates that the replaced drive uses 512-byte sectors instead of 4096, i.e. it was explicitly added with ashift=9.

My suggestion is to avoid all of that and have the pool line up with "what would come out when using the GUI".


Enough problems! Here's a solution that worked out today:
Create the pool specifying ashift=12:
zpool create -f -o ashift=12 blankpool raidz1 /root/sparsefile gptid/f84b869b-e1f2-11ed-8029-ac1f6bb3a54c gptid/f85066f7-e1f2-11ed-8029-ac1f6bb3a54c

Then the replacement follows nicely in the later steps:
zpool replace -o ashift=12 -f blankpool /root/sparsefile gptid/f3ec7e13-e1f1-11ed-8029-ac1f6bb3a54c

This generates a clean looking pool.

Code:
truenas# zpool status -v blankpool
  pool: blankpool
 state: ONLINE
  scan: resilvered 816K in 00:00:00 with 0 errors on Sun Apr 23 18:52:57 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        blankpool                                       ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/f3ec7e13-e1f1-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0
            gptid/f84b869b-e1f2-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0
            gptid/f85066f7-e1f2-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0

errors: No known data errors
 