ZFS Parity-only write-only emergency disk

SAK

Dabbler
Joined
Dec 9, 2022
Messages
20
It would be cool if ZFS had a layout that allowed a spare disk that was parity-only. This disk would be written to with parity data for emergencies only, and not read from. E.g., if you have a set of striped mirrors, and by some crazy chance lost both disks in a mirror...the single parity-only disk could be there to save the day. I don't believe this exists, right?

I realize it only saves the use of 1 disk, but seems like a good idea nonetheless. Would allow for very high resiliency like z2 but with simpler parity calculation which may have some advantages? The only time I tried resilvering a raidz2 pool (to upgrade disk sizes), the speed was looking so terrible that I gave up and created a new pool on the new disks and transferred over. Still a great solution for disk space utilization of course.

I realize I could create a mirrored raidz1 and then detach a disk, but wouldn't that affect write speed since all the other stripes would be reading from 2 disks and then 1 stripe would be left with only 1 disk to read? Mirrors in a stripe do increase read speed as reads are further striped, correct?
 
Joined
Oct 22, 2019
Messages
3,641
If you have a set of striped mirrors, and by some crazy chance lost both disks in a mirror...the single parity-only disk could be there to save the day.
How? Say you have 2 standard mirror vdevs. Each can hold 2 TiB. You would need an "emergency" parity disk to be at least 4 TiB in capacity (to cover both vdevs)... and even then you could just slap in extra disks to create three-way mirrors. (Or use RAIDZ2 in the first place.)

In order to prevent the loss of data from losing all disks in a mirror vdev, you need at minimum a drive with the same capacity as the useable ZFS storage in said vdev. (Which is basically the same concept as upgrading from a two-way mirror to a three-way mirror.) If you want the "emergency" disk to support multiple vdevs, its minimum capacity would have to be impractically large. (At which point, you need to reconsider your layout and redundancy ahead of time.)
 

SAK

Dabbler
Joined
Dec 9, 2022
Messages
20
Let's say you had 3 mirrors striped. AA, BB, CC. If each disk and therefore each mirror is 8TB, why couldn't a parity be calculated for ABC and the single dedicated parity disk could be 8TB as well? I realize this may not be built-into ZFS, but it would be cool.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I realize I could create a mirrored raidz1 and then detach a disk
There is no such thing.
Let's say you had 3 mirrors striped. AA, BB, CC. If each disk and therefore each mirror is 8TB, why couldn't a parity be calculated for ABC and the single dedicated parity disk could be 8TB as well?
Because:
In order to prevent the loss of data from losing all disks in a mirror vdev, you need at minimum a drive with the same capacity as the useable ZFS storage in said vdev.
In case this statement raises eyebrows (it should not), I remind the readers of Shannon's Source Coding Theorem. The proposed scheme would have one disk of n bits for each m>1 vdevs of n >> 1 bits, therefore we would need to compress these n*m bits down to n bits because any of the m vdevs could fail. According to the theorem, this would necessitate that the entropy H(X) be less than 1/m, where X is any of the bits.
Under the approximation that these n*m bits are already information that has been compressed down to near the Shannon limit of the original message (while this is not strictly true, we just have to be between 1/2 and 1 for this argument to hold), we can therefore say that the problem is mathematically impossible - or more technically, that it is impossible if we cannot compress the data by at least 2x (best case!).
In practical terms, since we have no practical a priori estimator for the n bits, the problem is always impossible, even if not strictly impossible from an information theory point of view.
I realize this may not be built-into ZFS, but it would be cool.
Sure, in the sense that impossible things would be cool if they were possible.
 
Joined
Oct 22, 2019
Messages
3,641
Also keep in mind that Unraid (yes, I know you never said the name) doesn't actually do that either.

Essentially they use some sort of proprietary software that creates a JBOD of XFS formatted drives (which can vary in size), and then you can slap in "parity-only" drives to hold only "parity data". My guess is it's some sort of "file-based" parity. Correction by @AlexGG: It's sector-by-sector parity; not file-based.

They boast how you can "lose an entire drive, but still be able to recover the files on the other drives!" This is because nothing is striped across drives; nor is any parity saved on the storage drives. (It's akin to losing one of your three XFS formatted drives. You'll lose all the files on Drive A, but still have access to the files on Drive B and C.)

Hence the name "Unraid". There is no real RAID involved.
 

SAK

Dabbler
Joined
Dec 9, 2022
Messages
20
I may need to apologize for my ignorance and/or my ability to communicate. I do so ahead of time. I am definitely not understanding your reply, Ericloewe. I've used RAID over the years. ZFS is newer to me, only been dabbling a couple years.

My understanding of raidz1 is that it is similar to raid5 in that you have n disks +1 of the same size. Let's say the data and parity didn't rotate, so the extra disk only had parity on it. Why would it matter if the data drives are mirrored? You could mirror them as many times as you like, it wouldn't change the number of disks or size of disks needed for parity info since there is no new data.

This is what has me confused.
 
Joined
Oct 22, 2019
Messages
3,641
Let's say the data and parity didn't rotate, so the extra disk only had parity on it.
How does that work? Why not just use a three-way mirror? Each drive in a mirror vdev is basically a replica of everything.
 

SAK

Dabbler
Joined
Dec 9, 2022
Messages
20
Yeah, the point is it would offer protection against losing both disks in a 2-disk mirror vdev. You could lose any 2 disks in the pool and still be online at the cost of 1 extra disk.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Because a pool of 3 two-disk Mirror vDevs, plus a 7th disk as RAID-3 parity, would use 2 fewer disks than 3 x 3-disk Mirrors.

The problem with the scheme (besides not being supportable by ZFS) is that the RAID-3 parity disk becomes a write bottleneck. Every single write also needs to be written to the RAID-3 parity disk.

Not to mention the dual scrubbing: first of the 3 x Mirrored vDevs, then of the 3 "striped" vDevs plus RAID-3 parity.

Note that RAID-5 uses striping with rotated parity, while RAID-3 uses a dedicated parity disk.
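
To put rough numbers on the bottleneck point, here is a minimal sketch in Python. It is not anything ZFS does. It assumes every small write touches one data device plus the parity device for that stripe, and both the rotation rule and the round-robin choice of data device are made-up simplifications (real RAID-5 layouts vary). The names parity_device and count_writes are hypothetical.

```python
from collections import Counter


def parity_device(stripe: int, n_devices: int, rotate: bool) -> int:
    """Index of the device holding parity for a given stripe."""
    if rotate:
        # RAID-5 style: parity moves to a different device on each stripe.
        return (n_devices - 1 - stripe) % n_devices
    # RAID-3/RAID-4 style: the last device always holds parity.
    return n_devices - 1


def count_writes(n_devices: int, n_stripes: int, rotate: bool) -> Counter:
    """Count device writes for one small write per stripe."""
    writes = Counter()
    for stripe in range(n_stripes):
        p = parity_device(stripe, n_devices, rotate)
        data_devices = [d for d in range(n_devices) if d != p]
        # Assumed workload: spread the data writes round-robin.
        target = data_devices[stripe % len(data_devices)]
        writes[target] += 1  # the data write itself
        writes[p] += 1       # the matching parity update
    return writes


# 3 data devices (think: 3 mirror vDevs) plus 1 parity device, 12 small writes.
print("dedicated:", dict(count_writes(4, 12, rotate=False)))
print("rotated:  ", dict(count_writes(4, 12, rotate=True)))
```

With dedicated parity the last device takes part in every single write, while rotation spreads the parity updates over all devices.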
 
Joined
Oct 22, 2019
Messages
3,641
You could lose any 2 disks in the pool and still be online at the cost of 1 extra disk.
...and then as you add more vdevs? What happens to your dedicated parity drive?

Before it may have been storing the "parity-only" data to "protect" a total of 6 TiB. Then you add another 2 TiB vdev to your pool. Now what? How would the dedicated parity drive handle this?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...and then as you add more vdevs? What happens to your dedicated parity drive?

Before it may have been storing the "parity-only" data to "protect" a total of 6 TiB. Then you add another 2 TiB vdev to your pool. Now what? How would the dedicated parity drive handle this?
But, then it is still a stripe of 2TB disks, just 4 of them. Similar to RAID-Zx expansion :smile:.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
Also keep in mind that Unraid (yes, I know you never said the name) doesn't actually do that either.

Essentially they use some sort of proprietary software that creates a JBOD of XFS formatted drives (which can vary in size), and then you can slap in "parity-only" drives to hold only "parity data". My guess is it's some sort of "file-based" parity.

For the sake of arbitrary completeness, Unraid uses disk-based parity with a dedicated parity drive (RAID4, or RAID3, whatever) or two dedicated parity drives. In Unraid, while the data is distributed across the disks file-by-file, the parity is sector-by-sector.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I may need to apologize for my ignorance and/or my ability to communicate. I do so ahead of time. I am definitely not understanding your reply, Ericloewe. I've used RAID over the years. ZFS is newer to me, only been dabbling a couple years.

My understanding of raidz1 is that it is similar to raid5 in that you have n disks +1 of the same size. Let's say the data and parity didn't rotate, so the extra disk only had parity on it. Why would it matter if the data drives are mirrored? You could mirror them as many times as you like, it wouldn't change the number of disks or size of disks needed for parity info since there is no new data.

This is what has me confused.
If you lose a disk in a RAIDZ vdev of k+p disks, with k>=1 and p the RAIDZ level in {1, 2, 3}, you lose n/k bits of data. Furthermore, RAIDZ is structured so that error correction is done at the vdev level, so the worst case for a single disk is indeed n/k, because we are not concerned about other vdevs, unlike your proposed solution. What's the worst case for n/k? Well, k can be as low as 1, useless as such a setup would be, because at that point each disk is storing n/1=n bits, like a mirror would. So, the entropy, in bits per bit, of the parity we store can be as high as 1. Since 1 is the maximum entropy we can have for our data, it is possible to recover our data.
Compare with your proposed scenario, where the very highest entropy we could accommodate was 0.5 bits per bit, necessitating that the data being stored compress by at least 2x. Since that is not possible in general, we concluded that your scheme could not prevent data loss.
 
Joined
Oct 22, 2019
Messages
3,641
But, then it is still a stripe of 2TB disks, just 4 of them. Similar to RAID-Zx expansion :smile:.
But the OP is referring to the concept of a "parity-only" drive that covers the entire pool.

2 storage vdevs + 1 parity-only drive
3 storage vdevs + 1 parity-only drive
4 storage vdevs + 1 parity-only drive
5 storage vdevs + 1 parity-only drive
And so forth...

There is no striping.

How could this work if you go from 2 storage vdevs to 3? Then 4? You'd have to keep adding more "parity-only" drives to make up for the extra total storage, as you need more capacity to hold "parity-only" data. Imagine you have an 8 TiB "parity-only" drive to cover a total of two RAIDZ vdevs. Then you expand the pool with a third RAIDZ vdev... will the same 8 TiB parity-only drive compensate for it? (Not to mention it would be radically different than the striping already done in RAIDZ.)

Besides that, imagine how insane writes would be on a pool with multiple RAIDZ vdevs plus this "parity-only" drive. Not only is ZFS calculating parity striped across each individual RAIDZ vdev itself, but it has to calculate additional parity in order to protect across all storage vdevs to write parity on the parity-only drive. That sounds insane.

At least with Unraid, it's simply a JBOD + parity. There are no "inception levels" of parity. Just the parity drive(s), and that's it. The storage drives in Unraid aren't striped.

To slap parity calculations on top of parity calculations would seem insane for ZFS.


Unraid uses disk-based parity with a dedicated parity drive (RAID4, or RAID3, whatever) or two dedicated parity drives. In Unraid, while the data is distributed across the disks file-by-file, the parity is sector-by-sector.
That makes more sense. Thanks for clearing it up. :smile:

Slightly off-topic: From what I heard, file recovery is still possible if you lose a storage drive in Unraid, even without any parity drives. You can access your files by mounting the remaining drives and access everything like normal (with the exception of the files that happened to be stored on the dead drive.)
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
But the OP is referring to the concept of a "parity-only" drive that covers the entire pool.

2 storage vdevs + 1 parity-only drive
3 storage vdevs + 1 parity-only drive
4 storage vdevs + 1 parity-only drive
5 storage vdevs + 1 parity-only drive
And so forth...

There is no striping.

How could this work if you go from 2 storage vdevs to 3? Then 4? You'd have to keep adding more "parity-only" drives to make up for the extra total storage, as you need more capacity to hold "parity-only" data. Imagine you have an 8 TiB "parity-only" drive to cover a total of two RAIDZ vdevs. Then you expand the pool with a third RAIDZ vdev... will the same 8 TiB parity-only drive compensate for it? (Not to mention it would be radically different than the striping already done in RAIDZ.)
...
If you treat each Mirror vDev as a data column in a RAID-3, the scheme works. (Well, conceptually... not with ZFS.) Note that when I say "each Mirror vDev", I mean the top level. Since each Mirror vDev may have 2 or 3 sub-Mirrors, they are irrelevant to the RAID-3 scheme on top.

As for expansion, when you add a new Mirror vDev, you have to re-compute the parity. Similar to any normal write to one of the preexisting Mirror vDevs.

The odd thing about this is that the parity drive would only have to be as large as the largest vDev in the server. And theoretically, this also would work for RAID-Zx vDevs, though the parity drive would have to be as large as the largest RAID-Zx vDev.
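
The conceptual scheme above can be sketched in a few lines of Python. This is a toy model only, not ZFS code: each top-level vDev is treated as one flat byte string (a "column"), parity is plain XOR, shorter columns are zero-padded, and the length of a lost column is assumed to be known from elsewhere. All function names are made up for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\0")
    b = b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


def compute_parity(columns: list[bytes]) -> bytes:
    """XOR all columns together; the result is as long as the longest one."""
    parity = b""
    for col in columns:
        parity = xor_bytes(parity, col)
    return parity


def rebuild(lost: int, columns: list[bytes], parity: bytes, lost_len: int) -> bytes:
    """Recover one lost column from the parity and the surviving columns."""
    acc = parity
    for i, col in enumerate(columns):
        if i != lost:
            acc = xor_bytes(acc, col)
    return acc[:lost_len]


# Three "vDevs" of different sizes; the parity is as long as the largest.
vdevs = [b"AAAAAAAA", b"BBBB", b"CCCCCC"]
parity = compute_parity(vdevs)
print(len(parity))                    # length of the largest column
print(rebuild(1, vdevs, parity, 4))   # contents of the lost second column

# Expansion: adding a column means folding it into the existing parity.
new_vdev = b"DDDDD"
parity = xor_bytes(parity, new_vdev)
```

Note this only covers the loss of one whole column at a time. Losing two columns at once is beyond what a single XOR parity can recover.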

Of course, this is all theoretical. It is never going to be something supported by ZFS.


This sort of reminds me of someone that wanted to RAID-Zx a single HDD. They partitioned it up, then RAID-Zx'd all the partitions together. The more (and smaller) the partitions, the wider the RAID-Zx vDev, the less parity used and the more space for data. This protected the user from bad blocks on a single disk, but obviously not disk failure.

Doing that oddball RAID-Zx on a single HDD was FAR more space efficient than "copies=2".
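
For a rough sense of the space difference, a back-of-the-envelope sketch in Python. It assumes the ideal RAID-Zx ratio of (width - parity) / width and treats copies=2 as storing everything twice. Both ignore padding, metadata and allocation overhead, so real numbers will be somewhat worse. The function name is made up for illustration.

```python
def raidz_usable_fraction(width: int, parity: int) -> float:
    """Ideal fraction of raw space left for data in a RAID-Zx vDev."""
    if parity not in (1, 2, 3):
        raise ValueError("parity must be 1, 2 or 3")
    if width <= parity:
        raise ValueError("width must be larger than the parity level")
    return (width - parity) / width


# Assumption: copies=2 writes every block twice, so about half is usable.
COPIES_2_FRACTION = 0.5

for width in (3, 4, 8, 16):
    frac = raidz_usable_fraction(width, parity=1)
    print(f"RAID-Z1 over {width:2d} partitions: {frac:.1%} usable "
          f"(copies=2: {COPIES_2_FRACTION:.1%})")
```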
 
Joined
Oct 22, 2019
Messages
3,641
So what you're saying is we should expect this to be available in TrueNAS SCALE 24.x?
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
Slightly off-topic: From what I heard, file recovery is still possible if you lose a storage drive in Unraid, even without any parity drives. You can access your files by mounting the remaining drives and access everything like normal (with the exception of the files that happened to be stored on the dead drive.)

This is correct. Also, if the filesystem fails in a logical sense, the contents of only one drive are affected, because Unraid has an independent filesystem for each drive. That is unlike regular configurations where everything goes down with a single filesystem.
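
A toy illustration of that point in Python, under loud assumptions: each drive is modelled as its own independent dict of files, and files are placed round-robin, which is a made-up policy (the thread does not describe how Unraid actually picks a drive). The function names are hypothetical.

```python
def place_files(files: dict[str, bytes], n_drives: int) -> list[dict[str, bytes]]:
    """Put each whole file on exactly one drive (assumed round-robin policy)."""
    drives: list[dict[str, bytes]] = [{} for _ in range(n_drives)]
    for i, (name, data) in enumerate(sorted(files.items())):
        drives[i % n_drives][name] = data
    return drives


def surviving_files(drives: list[dict[str, bytes]], failed: int) -> dict[str, bytes]:
    """Files still readable after one drive (and its filesystem) is gone."""
    return {
        name: data
        for i, drive in enumerate(drives)
        if i != failed
        for name, data in drive.items()
    }


files = {name: name.encode() * 4 for name in "abcdef"}
drives = place_files(files, n_drives=3)
left = surviving_files(drives, failed=0)
print("lost:     ", sorted(set(files) - set(left)))
print("surviving:", sorted(left))
```

Because no file is split across drives, only the files that lived on the failed drive are gone. With striping, every file would have had a piece on it.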
 

asap2go

Patron
Joined
Jun 11, 2023
Messages
228
It would be cool if ZFS had a layout that allowed a spare disk that was parity-only. This disk would be written to with parity data for emergencies only, and not read from. E.g., if you have a set of striped mirrors, and by some crazy chance lost both disks in a mirror...the single parity-only disk could be there to save the day. I don't believe this exists, right?

I realize it only saves the use of 1 disk, but seems like a good idea nonetheless. Would allow for very high resiliency like z2 but with simpler parity calculation which may have some advantages? The only time I tried resilvering a raidz2 pool (to upgrade disk sizes), the speed was looking so terrible that I gave up and created a new pool on the new disks and transferred over. Still a great solution for disk space utilization of course.

I realize I could create a mirrored raidz1 and then detach a disk, but wouldn't that affect write speed since all the other stripes would be reading from 2 disks and then 1 stripe would be left with only 1 disk to read? Mirrors in a stripe do increase read speed as reads are further striped, correct?
That sounds like dRAID.
Except that only works per vdev, not for the entire pool.
But a pool can be extended by adding more vdevs, which makes the size of a parity drive for an entire pool theoretically 2^128 bytes.
 