Pool layout w/ 16TB disks

MrBucket101

Dabbler
Joined
Jul 9, 2018
Messages
18
My friend has 2x6rz2 with 12TB disks. He recently suffered a disk failure, and his rebuild time was 6 days. My current pool is 1x12rz3 w/ 16TB disks. I'm concerned that if I were to suffer a failure, the rebuild time would be significantly longer than 6 days. So I've been trying to plan a new pool and shuffle my data around so I can rebuild.

The performance of my current pool is just okay, so I was hoping to add some more stripes to help with that.

Right now, I’m considering purchasing 4 more drives, for a total of 16.

Then I would build a new pool consisting of 4x4rz1. This would give me a nice boost to performance.

My main concern is rz1 with 16TB disks. I have a complete offsite backup that I replicate to, so a total failure isn't a catastrophe, but I'd prefer to build something resilient enough that it wouldn't matter.

With 4 disks in a single vdev, the rebuild should be fairly quick?

Expansion would also be somewhat cost-effective, since I would only need to purchase disks in groups of 4 instead of 6 or 8.
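
To make the idea concrete, here's a rough sketch of the layout I'm picturing (pool name and device names are just placeholders, and on TrueNAS I'd actually build this through the UI rather than the command line):

Code:
# Proposed pool: four 4-wide raidz1 vdevs (16 disks total).
zpool create tank \
  raidz1 sda sdb sdc sdd \
  raidz1 sde sdf sdg sdh \
  raidz1 sdi sdj sdk sdl \
  raidz1 sdm sdn sdo sdp

# Later expansion in groups of 4: add another 4-wide raidz1 vdev.
zpool add tank raidz1 sdq sdr sds sdt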

Pool usage is primarily bulk data storage, backup, and archival. At any point in time there are usually 3-4 people reading/writing from the pool, not including headless VMs.

Any feedback is greatly appreciated.

Thanks!
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
A 16 TB HDD cannot rebuild quickly, and with raidz1 a disk failure would put quite a lot of data at risk. If you want to avoid restoring from backup (at least you have one: good!), you'd better stick to raidz2. Or raidz3: your current array would certainly take some time to rebuild, but it is quite resilient.
To increase performance while maintaining some resiliency, I'd suggest 2 * 8-wide raidz2. Or 3 * 6-wide, but that's two more drives.
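
For comparison, a sketch of the 2 * 8-wide raidz2 layout with the same 16 disks (placeholder device names; the TrueNAS UI would normally build this for you):

Code:
# Two 8-wide raidz2 vdevs: 16 disks, two disks of parity per vdev.
zpool create tank \
  raidz2 sda sdb sdc sdd sde sdf sdg sdh \
  raidz2 sdi sdj sdk sdl sdm sdn sdo sdp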
 

MrBucket101

Dabbler
Joined
Jul 9, 2018
Messages
18
A 16 TB HDD cannot rebuild quickly, and with raidz1 a disk failure would put quite a lot of data at risk. If you want to avoid restoring from backup (at least you have one: good!), you'd better stick to raidz2. Or raidz3: your current array would certainly take some time to rebuild, but it is quite resilient.
To increase performance while maintaining some resiliency, I'd suggest 2 * 8-wide raidz2. Or 3 * 6-wide, but that's two more drives.
Thanks for confirming what I suspected.

I had read somewhere that the width of a vdev had the biggest impact on a resilver, but 16TB disks in 4x4rz1 just sounded too good to be true.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
You can also set some ZFS parameters to determine how much to prioritize the resilver, so you could favor better pool performance over a shorter resilver time. That would mean you don't really care how long it takes to complete, as long as the system performs well in the meantime.
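
As one example of the kind of tunable I mean (not necessarily the only one, and you should verify it exists on your release before relying on it), OpenZFS on Linux exposes zfs_resilver_min_time_ms, the minimum time spent on resilver I/O per txg:

Code:
# Read the current value (typically 3000 ms by default):
cat /sys/module/zfs/parameters/zfs_resilver_min_time_ms

# Lower it to favor normal pool I/O over the resilver,
# or raise it to make the resilver more aggressive:
echo 1000 > /sys/module/zfs/parameters/zfs_resilver_min_time_ms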
 

MrBucket101

Dabbler
Joined
Jul 9, 2018
Messages
18
You can also set some ZFS parameters to determine how much to prioritize the resilver, so you could favor better pool performance over a shorter resilver time. That would mean you don't really care how long it takes to complete, as long as the system performs well in the meantime.
Do you know which parameters I would be interested in? I'm googling and reading through the documentation now, but I can't seem to find anything that looks promising.

Outside of "Resilver Priority" in the Data Protection menu, that is.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Sorry, I'm busy trying to update to Cobia and dealing with lots of errors. I was not aware the resilver priority setting was there; it's not on Cobia. Perhaps SCALE already runs the resilver at low priority and this merely raises it. Someone else can respond.

I was speaking of ZFS parameters, not TrueNAS settings. I vaguely recall them, though I could be wrong, as they may have been removed. But I can't spend any time on that right now, sorry about that. Too many fires.
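
If anyone wants to check what's actually still available on their build, the loaded module parameters can be listed directly (a quick Linux/SCALE sketch):

Code:
# List the resilver/scrub/scan related tunables present on this system:
ls /sys/module/zfs/parameters | grep -Ei 'resilver|scrub|scan'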
 

MrBucket101

Dabbler
Joined
Jul 9, 2018
Messages
18
Sorry, I'm busy trying to update to Cobia and dealing with lots of errors. I was not aware the resilver priority setting was there; it's not on Cobia. Perhaps SCALE already runs the resilver at low priority and this merely raises it. Someone else can respond.

I was speaking of ZFS parameters, not TrueNAS settings. I vaguely recall them, though I could be wrong, as they may have been removed. But I can't spend any time on that right now, sorry about that. Too many fires.
No worries. All of the parameters I found that were recommended to tweak weren't even available to be changed.

The resilver priority box is the opposite: you define a window during which the resilver runs at lower priority (think normal business hours), and outside of that window it's elevated.
 