ZFS device removal - what about pools with RAIDZ vdevs?


Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
This thread was spun off from here:
https://forums.freenas.org/index.ph...me-with-4-additionnal-hdds.69191/#post-475644

tl;dr - OP has the following pool:
Code:
tank
	RAIDZ3-0
		disk0
		disk1
		disk2
		disk3
		disk4
		disk5
		disk6
	single-disk0
	single-disk1
	single-disk2
	single-disk3


Naturally, the four single disks need to go. This spawned the question of whether ZFS device removal, in its initial FreeNAS 11.2 state, would work.
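
For concreteness, the operation in question would be a zpool remove aimed at the stray top-level vdevs. A rough sketch, using the placeholder names from the diagram above rather than real gptid labels, with no promise yet that ZFS will accept it given the RAIDZ3 vdev (that's the whole question below):
Code:
# Hypothetical - evacuate one stray single-disk top-level vdev back onto the rest of the
# pool; repeat for each of the four once the previous removal finishes (one runs at a time).
# A real FreeNAS pool would reference gptid/... labels here, not this placeholder name.
zpool remove tank single-disk0

# Removal runs in the background; progress appears under "remove:" in zpool status.
zpool status tank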


Device removal would be awesome here, but I'm pretty sure it doesn't work if there are RAIDZ vdevs...
 
Last edited:

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Device removal would be awesome here, but I'm pretty sure it doesn't work if there are RAIDZ vdevs...

As a sidebar, this would actually work. The striped drives are individual top-level vdevs and qualify for device removal, but it wouldn't address the fact that the RAIDZ3 can't ever be expanded in the way the OP wants ("just add four more data drives" to the existing vdev) at present. We need to wait for RAIDZ vdev expansion to be implemented.

For RAIDZ, vdev removal does work, but you can only remove at the vdev level, not individual drives, so he's stuck with what he's got.
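
To see which top-level vdevs are even candidates, a per-vdev breakdown helps; zpool list -v prints one row per top-level vdev plus its children. A minimal sketch, using the OP's pool name:
Code:
# Each top-level vdev (the raidz3, and each lone disk) gets its own row with its own
# allocation figures. zpool remove targets those top-level rows, never an individual
# child disk sitting inside the RAIDZ3 vdev.
zpool list -v tank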

Edit: Original thread context is this:
https://forums.freenas.org/index.ph...-raidz3-volume-with-4-additionnal-hdds.69191/
 
Last edited:

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It does? I was under the impression that the pool could only have mirror-type vdevs.
 
danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
It does? I was under the impression that the pool could only have mirror-type vdevs.
As I understood the presentation (of which I just watched the video in the last few days), you can only remove stripe or mirror vdevs--but I may have misunderstood it.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I asked on IRC, on the #openzfs channel, and the answer I got matches my understanding that all vdevs must be mirrors/single disks.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
As I understood the presentation (of which I just watched the video in the last few days), you can only remove stripe or mirror vdevs--but I may have misunderstood it.
That's what's implied, but it's vague enough to keep me from just saying "update to 11.2-Beta and use device removal there". I'll see if I can validate my info.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
the answer I got matches my understanding that all vdevs must be mirrors/single disks.
If that's the case, it will render the feature much less useful.
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
That's what's implied, but it's vague enough to keep me from just saying "update to 11.2-Beta and use device removal there". I'll see if I can validate my info.
But wouldn't this also imply there is no data on those vdevs?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
This is a hell of a sidebar and should probably be in a separate thread. Ahem. @Ericloewe ;)

I asked on IRC, on the #openzfs channel, and the answer I got matches my understanding that all vdevs must be mirrors/single disks.

OpenZFS must have a different implementation of "vdev removal" than Oracle ZFS, although I was pretty sure the intention was feature parity.

In the Solaris 11.4 release trumpeting it, one of the example cases was correcting an issue where a pool with a single RAIDZ1 got an SSD mistakenly added as a pool device instead of as cache:

https://blogs.oracle.com/solaris/oracle-solaris-zfs-device-removal

But the reference document says "removing toplevel data vdevs in a RAID-Z pool is unsupported"

https://docs.oracle.com/cd/E37838_01/html/E61017/remove-devices.html

As I understand it, you can't remove the RAIDZ vdev, but if you screw up and add a single drive as a stripe, you can still remove that one. ZFS just can't remove the RAIDZ one, because it can't make safe assumptions about the space map the way it can with a mirror vdev.

However I don't have access to Oracle Solaris 11.4 to check this out personally.
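
On the OpenZFS side there's at least a way to see the mapping cost up front: removal of a mirror/single-disk vdev works by copying its blocks elsewhere and keeping an indirect mapping table for the old locations, and the dry-run flag reports how much memory that table would need. A sketch, assuming the -n flag from the upstream device-removal work made it into the 11.2 build (the device name is a placeholder):
Code:
# Hypothetical dry run: -n estimates the memory the indirect mapping table would use
# after removing this top-level vdev, without actually starting the removal.
zpool remove -n tank gptid/PLACEHOLDER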
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hey, now that we're in a new thread -

Wouldn't there have been some warning in the GUI when the initial "four-drives-as-stripes" was added? Don't all ports of ZFS scream about a "mismatched replication level" if you add a vdev with a different level of failure tolerance?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Wouldn't there have been some warning in the GUI when the initial "four-drives-as-stripes" was added?
Yes, there would have been. With bold red text saying what happens. And you need to switch into manual mode to make it happen at all. But nonetheless, it continues to happen.
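
For anyone doing this from the shell instead of the GUI, the equivalent guard rail is zpool add's redundancy check; a rough sketch with a hypothetical disk name (exact message wording varies by platform and version):
Code:
# Without -f, zpool add rejects the mismatch with something along the lines of
# "mismatched replication level: pool uses raidz and new vdev is disk".
zpool add tank da4

# Only an explicit override stripes the bare disk in - which is exactly the foot-gun here.
zpool add -f tank da4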
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yes, there would have been. With bold red text saying what happens. And you need to switch into manual mode to make it happen at all. But nonetheless, it continues to happen.
Well then.

OP of that other thread owes his colleague a slap with a large trout.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
OP of that other thread owes his colleague a slap with a large trout.
No, can't be smaller than a salmon. At least this wasn't a misguided attempt at disk replacement.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
This is a hell of a sidebar and should probably be in a separate thread. Ahem. @Ericloewe ;)



OpenZFS must have a different implementation of "vdev removal" than Oracle ZFS, although I was pretty sure the intention was feature parity.

In the Solaris 11.4 release trumpeting it, one of the example cases was correcting an issue where a pool with a single RAIDZ1 got an SSD mistakenly added as a pool device instead of as cache:

https://blogs.oracle.com/solaris/oracle-solaris-zfs-device-removal

But the reference document says "removing toplevel data vdevs in a RAID-Z pool is unsupported"

https://docs.oracle.com/cd/E37838_01/html/E61017/remove-devices.html

As I understand it, you can't remove the RAIDZ vdev, but if you screw up and add a single drive as a stripe, you can still remove that one. ZFS just can't remove the RAIDZ one, because it can't make safe assumptions about the space map the way it can with a mirror vdev.

However I don't have access to Oracle Solaris 11.4 to check this out personally.

After the fork, they’ve diverged completely. You cannot assume anything about an OpenZFS implementation of a feature based on Oracle docs (post-fork).
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Well, really, this should be straightforward enough to determine experimentally on a VM. I'm working on it, but may not be able to get an answer before I have to leave for work.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The warning looks like this:
[attached screenshot: upload_2018-8-15_6-27-47.png - the GUI warning shown when adding a non-redundant vdev to the pool]

Edit: And it simply isn't possible to add the disk to the pool this way--you have to go to Manual Setup to do it.
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
So I set up a VM with 11.2-BETA2 and created a RAIDZ1 pool with three disks:
Code:
root@freenas:~ # zpool status tank
  pool: tank
 state: ONLINE
  scan: none requested
config:

  NAME                                            STATE     READ WRITE CKSUM
  tank                                            ONLINE       0     0     0
    raidz1-0                                      ONLINE       0     0     0
      gptid/7bfd1e52-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0
      gptid/7f616dbe-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0
      gptid/8383057d-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0

errors: No known data errors
root@freenas:~ # zpool list
NAME           SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
freenas-boot  15.9G  2.24G  13.6G        -         -      -    14%  1.00x  ONLINE  -
tank          89.5G   551M  89.0G        -         -     0%     0%  1.00x  ONLINE  /mnt


Decided to hate my data and stripe in a fourth disk, bypassing the warning I posted above:
Code:
root@freenas:~ # zpool status tank
  pool: tank
 state: ONLINE
  scan: none requested
config:

  NAME                                            STATE     READ WRITE CKSUM
  tank                                            ONLINE       0     0     0
    raidz1-0                                      ONLINE       0     0     0
      gptid/7bfd1e52-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0
      gptid/7f616dbe-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0
      gptid/8383057d-a075-11e8-b077-6173bfff8520  ONLINE       0     0     0
    gptid/fc0246fa-a075-11e8-b077-6173bfff8520    ONLINE       0     0     0

errors: No known data errors
root@freenas:~ #

Oh noes, there's data on that striped disk!
Code:
root@freenas:~ # zpool list -v
NAME                                            SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
freenas-boot                                   15.9G  2.24G  13.6G        -         -      -    14%  1.00x  ONLINE  -
  da0p2                                        15.9G  2.24G  13.6G        -         -      -    14%
tank                                            119G  14.3G   105G        -         -     0%    12%  1.00x  ONLINE  /mnt
  raidz1                                       89.5G  13.4G  76.1G        -         -     0%    14%
    gptid/7bfd1e52-a075-11e8-b077-6173bfff8520      -      -      -        -         -      -      -
    gptid/7f616dbe-a075-11e8-b077-6173bfff8520      -      -      -        -         -      -      -
    gptid/8383057d-a075-11e8-b077-6173bfff8520      -      -      -        -         -      -      -
  gptid/fc0246fa-a075-11e8-b077-6173bfff8520   29.5G   981M  28.5G        -         -     0%     3%
root@freenas:~ #

So, the moment of truth:
Code:
root@freenas:~ # zpool remove tank gptid/fc0246fa-a075-11e8-b077-6173bfff8520
cannot remove gptid/fc0246fa-a075-11e8-b077-6173bfff8520: invalid config; all top-level vdevs must have the same sector size and not be raidz.


So, confirmed: device removal is pretty much worthless.
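
For contrast, here's a sketch of the one layout that error message is asking for - a pool whose top-level vdevs are all mirrors or single disks, with matching sector sizes (hypothetical pool and device names):
Code:
# Hypothetical all-mirror pool - no raidz top-level vdevs.
zpool create demo mirror da1 da2 mirror da3 da4

# Removing an entire top-level mirror is allowed; its data gets remapped onto the
# remaining vdev, and progress shows up under "remove:" in zpool status.
zpool remove demo mirror-1
zpool status demo

So it's not useless for all-mirror pools, but it clearly doesn't help a pool like the OP's.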
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Perhaps not worthless, but a first step.

This coming weekend I may play with the master of ZFS on Linux again. The following list says device removal, pool checkpoint, and encryption are on ZoL master (available from Git):

https://soluble.zgrep.org/zfs.html

I'd hoped that D-RAID would have made it, but the pull request is still working its way through testing and approvals. It does seem to be getting closer. One issue was that they had to fix a bunch of quirks and bugs to make a new top-level vdev type available. Another issue was the pool/vdev creation method (not needing an external program or file was highly desired). But D-RAID can be another thread :smile:.
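
For anyone building master to try this, the quickest sanity check is the feature-flag list; a sketch, with the feature names as they appear in current ZoL/OpenZFS (treat the exact set as an assumption for any given build):
Code:
# List every pool feature this zpool binary knows about, filtered to the ones above.
zpool upgrade -v | grep -E 'device_removal|zpool_checkpoint|encryption'

# Or, on an existing pool, check which features are enabled/active.
zpool get all tank | grep feature@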
 