
Creating a degraded pool

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
danb35 submitted a new resource:

Creating a degraded pool - Creating an intentionally-degraded pool to allow for later expansion/redundancy

WARNING: HERE BE DRAGONS. IF YOU DON'T KNOW EXACTLY WHAT YOU'RE DOING, DO NOT FOLLOW THESE INSTRUCTIONS.

IN FACT, IF YOU NEED THESE INSTRUCTIONS, YOU PROBABLY SHOULDN'T FOLLOW THEM.


There are a few cases where it may make sense to create a degraded pool in your FreeNAS server. One of the most common would be a case where you have all the disks you need for your pool, but one of them already has data on it. The degraded pool would let you create,...

Read more about this resource...
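
For anyone skimming the thread, the core of the technique is roughly this (a minimal sketch, assuming a 3-disk RAIDZ1 and made-up device names; the resource itself covers the proper partitioning, gptids and caveats):

Code:
# sparse file at least as large as the missing disk
truncate -s 4T /root/sparsefile

# create the pool from the real disks plus the file (-f is typically needed when mixing a file with real disks)
zpool create -f mypool raidz1 /dev/da1 /dev/da2 /root/sparsefile

# offline the file immediately so nothing ever gets written to it
zpool offline mypool /root/sparsefile

# later, when the real disk is free, resilver it in place of the file
zpool replace mypool /root/sparsefile /dev/da3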
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
IN FACT, IF YOU NEED THESE INSTRUCTIONS, YOU PROBABLY SHOULDN'T FOLLOW THEM.
I feel this way about a lot of things... Regardless, thank you for the guide!
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
This is a great tip. I believe I am a candidate for this. I have a single drive (in the same box as the source pool) as a target for replication. I want to expand this to the RAIDZ2 on a second FreeNAS box, but don’t want to replicate again from the source pool.

Sound reasonable?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I believe I am a candidate for this.
If I'm understanding you correctly, I don't think you are. How are you thinking that this guide would help you do what you're wanting to do?
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
If I'm understanding you correctly, I don't think you are. How are you thinking that this guide would help you do what you're wanting to do?
My thinking is that because the current backup drive has all of the data and will be one of the disks in the planned RAIDZ2 pool, I can set up the new pool with 4 physical drives and the sparse file. Once the pool is created, I can swap the sparse file for the physical backup drive and then resilver. Done this way, the data on the backup drive remains “intact” during the process, versus creating the RAIDZ2 with all 5 drives (including the backup drive) from the start, where the data on the backup drive would be destroyed and would need to be replicated again in its entirety.

If this was your understanding, please explain my error.
 

johnnyspo

Dabbler
Joined
Nov 30, 2012
Messages
13
When you resilver the backup drive into the RAIDZ2 pool, all the data on that backup drive will be lost.
I see now. I went back and re-read the guide and along with your reply, I see my mistake. Thanks for the clarification!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I'm trying to do something a bit different and need to write down my steps to plan and confirm what I'm doing, so I thought I would just post my workings here, since they use this resource in large part (I'm fine with a mod moving this somewhere else if that's preferred).

First, what am I doing?
I have a pool with 2 VDEVs, both 8 disks wide, RAIDZ2 (vol5)
I have an additional pool with a single 8-disk vdev RAIDZ2 (vol1)

My ultimate goal is to replace all the disks in vol1 with bigger ones, scrap that pool and make those 8 disks a third VDEV in vol5, carrying over all the content from vol1 to vol5 in the process.

Needless to say, in all of this I am prepared for the loss of all data on this FreeNAS server before commencing, so in the worst-case scenario I just rebuild it all and restore from backup... but nobody wants to do that; it's no fun and certainly not plan A for me (more like plan Z).

Also, I am doing it this way because I don't want to go above 80% on vol5, which would happen along the way if I just moved the data there directly. I also wanted to balance the VDEVs a bit rather than further stacking the existing 2 VDEVs with all the new data while leaving the new VDEV completely empty, hence this less direct route.

So, I have only 24 slots in my chassis, meaning there's no easy way to just attach the 8 new disks, add them as the new VDEV, and be done with it.

To begin, I will stop all the jails with mounts for vol1 (keeping a list of them to reinstate with the new vol5 locations) and will take note of SMB shares for that pool.
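
A sketch of what that looks like with iocage jails (jail names are placeholders; I believe iocage fstab -l lists the mounts):

Code:
# note which jails are running
iocage list

# record the vol1 mounts for each affected jail
iocage fstab -l myjail1 | grep vol1

# stop each affected jail
iocage stop myjail1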

My system dataset is on a different pool (vol6), so I won't need to move that off vol1.

What I will do (since I'm lucky enough not to have too much data on vol1) is:

  1. Remove a disk (offline it) from vol1
    • Using the GUI (Storage | Pools | vol1 ... status: select disk ... offline)
  2. Replace it with one of the new larger disks
    • Physically remove the old and insert the new, nothing in the GUI or CLI here.
    • (In theory I could take a little more risk here if I needed the space and make a striped pool with 2 of the new disks... then having no redundancy in vol1 while copying... and clearly none in the temp copy in either case.)
    • Also worth noting that I record the position in the chassis, capacity, serial number and disk model/type, so I need to update those records as I go throughout this process.
  3. Create a temp pool on that single disk
    • zpool create tmpcopy da22
  4. snapshot and zfs send|recv vol1 to it
    • zfs snapshot -r vol1@tempsend
    • zfs send -R vol1@tempsend | pv | zfs recv tmpcopy/vol1 (maybe want to do this in a tmux session as it could take a long time and SSH can get disconnected)
  5. Detach vol1 (not erasing the content) and remove all the old disks (leaving the 1 replaced one with the temp pool in place... 7 slots now open in the chassis)
    • Storage | Pools | vol1 cogwheel Export/Disconnect
    • Then physically remove all the old disks from that pool
    • Also an interesting note, a little related to the keeping of records while I do this... Since my VDEVs are important to the health of the pool, I don't want any of them to be open to failures due to other components, particularly the backplane (mine is a 6-row system with 4 disks per row). So I make sure that my layout has no more than 2 disks from any one VDEV in any of the 4-disk rows, meaning a backplane failure of one row can't take any pool offline. A convenient side effect is that the heat distribution is better, as VDEVs work independently and one may be busy while another is not.
  6. add the 7 new disks into the chassis
    • Physically, nothing in GUI or CLI for this step
  7. prepare all of the disks as per @danb35's process (burning them all in beforehand would be a good thing to do here too)
    • gpart create -s gpt /dev/da18
    • gpart add -i 1 -b 128 -t freebsd-swap -s 2g /dev/da18
    • gpart add -i 2 -t freebsd-zfs /dev/da18
    • repeat x7
  8. Create the sparsefile
    • truncate -s 10T /root/sparsefile
  9. checkpoint the pool vol5 (so if it all goes bad I don't kill vol5 or at least have a chance of coming back to the "before" state)
    • zpool checkpoint vol5
  10. Zpool add the new vdev (using a slightly modified version of the process since it's an existing pool, not a new one) using the 7 disks and the sparsefile
    • zpool add -f vol5 raidz2 /root/sparsefile gptid/dee8fa86-c056-11e8-90eb-002590caf340 gptid/e1e2ba67-c056-11e8-90eb-002590caf340 .......... +5 more gptids
    • Offline the sparsefile zpool offline vol5 /root/sparsefile
  11. zfs send|recv vol1 from the temp pool to vol5 (maybe move things around a bit in that process, but still using send|recv for each dataset)
    • zfs snapshot -r tmpcopy/vol1@tempsend
    • zfs send -R tmpcopy/vol1/data@tempsend | pv | zfs recv vol5/data ...
  12. check all is OK on vol5
    • zpool status -v vol5
    • ls -l or whatever
  13. Remove the checkpoint
    • zpool checkpoint -d vol5
    • Interesting to note that you can't replace a failed disk on a pool while a checkpoint is in place
  14. destroy the temp pool and wipe that disk
    • zpool destroy tmpcopy (note: zpool destroy has no -r flag; that belongs to zfs destroy)
    • Wipe the disk in the GUI. Storage | Disks Select disk ... Wipe
  15. use that wiped disk to replace the sparsefile
    • Should work in the GUI. Storage | Pools | vol5 ... Status, select the sparsefile entry ... Replace (select the wiped disk); see also the CLI sketch after this list.
  16. delete the sparsefile
    • rm /root/sparsefile
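A CLI sketch for step 15, in case the GUI doesn't cooperate (the gptid is a placeholder for whatever partition the wiped disk ends up with after repartitioning it as in step 7; add -f if zpool complains):
    • zpool replace vol5 /root/sparsefile gptid/<new-disk-gptid>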
Then re-do the jail mounts and re-share for SMB and start the jails.
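
For the jail side, something like this (a sketch with placeholder names and paths; the fstab entry format is source destination fstype options dump pass):

Code:
# re-add each mount at its new vol5 location
iocage fstab -a myjail1 /mnt/vol5/data /mnt/data nullfs rw 0 0

# start the jail again
iocage start myjail1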

So I have several thoughts about what could go wrong at which points and what I would do about that:

If the new drive fails between steps 4 and 13:
I have all the data on the 7 original disks from the vol1 pool; just back out to the beginning and start again after RMA of the bad disk.

If a disk fails in vol1 between steps 4 and 5:
Sweat it out and live with 6 disks and no redundancy, or back out, replace the bad disk with a spare, and wait for the resilver before progressing.
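If I did go the spare route, the replacement itself would be roughly this (gptids are placeholders, and the spare would need partitioning as in step 7 first):
    • zpool replace vol1 gptid/<failed-disk-gptid> gptid/<spare-disk-gptid>
    • zpool status -v vol1 to watch the resilver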

If something goes wrong with step 10:
roll back the checkpoint on vol5
  • zpool export vol5
  • zpool import --rewind-to-checkpoint vol5
  • zpool export vol5
  • Import the pool from the GUI
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
As I'm investigating different avenues forward for my storage configuration, I tried this resource guide out.
As it turns out, it requires some tweaks as of TrueNAS-13.0-U4.

The problem I ran into:
At the stage where the pool is reimported, the GUI no longer allows you to do the replace step on the offlined sparse file.
(Screenshot: 2023-04-23_18-51.jpg)


Turning to CLI, there is not much luck either:
Code:
zpool replace -f blankpool /root/sparsefile gptid/ed391415-e1ef-11ed-8029-ac1f6bb3a54c                       
cannot replace /root/sparsefile with gptid/ed391415-e1ef-11ed-8029-ac1f6bb3a54c: already in replacing/spare config; wait for completion or use 'zpool detach'


The problem appears to be related to ashift.
When the pool is created as per the guide, it defaults to ashift=9 (which shows up as "0" in zpool get all; a stored value of 0 just means the property was never set explicitly and ZFS auto-detected it).
Code:
truenas# zpool get all | grep ashift
blankpool  ashift      0     local
boot-pool  ashift      0     default
tank       ashift      12    local

In order to avoid the error, you would need to specify ashift=9 when replacing the drive.
That would look something like zpool replace -o ashift=9 -f blankpool /root/sparsefile gptid/asdf...
When doing that, however, the output of zpool status -v blankpool (sorry, I didn't save an explicit example) indicates that the replaced drive uses 512-byte sectors instead of 4096, i.e. it was explicitly added with ashift=9.

My suggestion is to avoid all of that and have the pool line up with "what would come out when using the GUI".


Enough problems! Here's a solution that worked out today:
Create the pool specifying ashift=12:
zpool create -f -o ashift=12 blankpool raidz1 /root/sparsefile gptid/f84b869b-e1f2-11ed-8029-ac1f6bb3a54c gptid/f85066f7-e1f2-11ed-8029-ac1f6bb3a54c

Then the replacement follows nicely in the later steps:
zpool replace -o ashift=12 -f blankpool /root/sparsefile gptid/f3ec7e13-e1f1-11ed-8029-ac1f6bb3a54c

This generates a clean looking pool.

Code:
truenas# zpool status -v blankpool
  pool: blankpool
 state: ONLINE
  scan: resilvered 816K in 00:00:00 with 0 errors on Sun Apr 23 18:52:57 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        blankpool                                       ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/f3ec7e13-e1f1-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0
            gptid/f84b869b-e1f2-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0
            gptid/f85066f7-e1f2-11ed-8029-ac1f6bb3a54c  ONLINE       0     0     0

errors: No known data errors
 