Disk Throughput to Resilver Chart/Data?


joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Do we have something buried in the forums which discusses or quantifies the speed of resilvering related to some hard drive metrics?

I'm looking for something that probably already establishes that resilver time is directly related to the average write speed of the new drive and the read speed of the remaining pool. I assume sustained average read/write speed and IOPS are the main factors (a lot goes into working those values out), given all other factors are the same, meaning the rest of the hardware is not a bottleneck.

Why am I asking...
I'm getting ready to purchase new hard drives and I want to reduce the total count to either 4 drives in a RAIDZ2 or 3 drives in a 3-way mirror. That means large-capacity hard drives, which means longer resilver times, and I'd like to have an idea of what to expect for worst-case times. I'd like to calculate a rough average time to resilver a drive given the pool is 80% full. Right now I'm seriously contemplating 7200 RPM drives due to the higher transfer rates, but IOPS also needs to be a factor.

Hmm, a new type of calculator would be very cool. If a new calculator were created, I'd think a standardized test would need to be defined to rate the speed of your pool across the entire surface (maybe bonnie++); plug that and the drive capacity into the calculator and, poof, you can figure out how long it would take to resilver a drive at a given capacity.
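As a rough back-of-the-envelope illustration (assuming, say, ~150 MB/s sustained throughput and ignoring IOPS and fragmentation entirely): 80% of an 8 TB drive is about 6.4 TB, and 6.4 TB / 150 MB/s works out to roughly 42,700 seconds, or just under 12 hours, as an absolute best-case floor.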

So if someone knows of a formula or calculator or chart that quantifies what I'm looking for, please share.

I'm still searching the internet; there must be something out there. Maybe it will just come down to a good educated guess.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's going to be hard to take values from one pool and apply them to another.

If it's any consolation, work is nearing completion that will massively increase scrub/resilver speeds on spinning rust (SSDs get a decent boost too, but they're far less constrained by IOPS).
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
It's going to be hard to take values from one pool and apply them to another.
My original goal is not to compare two different systems/pools for resilvering, but rather to come up with a way to calculate how long resilvering would take for a specified drive, provided the system is not a bottleneck. My next thought was to create a calculator that would accept some input from a benchmark (average read/write throughput) on a current pool and then calculate how long it would take to resilver an 80% full pool. If the calculations were within a few hours of the real world, I'd be happy, and then obsess over why I can't get it to be more accurate. The benchmark test, along with file fragmentation and how full the pool is, would all be a nightmare to figure out. I'll likely drop it once I figure out the answer for my future pool, but it could keep me busy for a rainy weekend.

I'm running Bonnie++ right now on my pool and the results seem fairly consistent. I may be able to plug this data into a spreadsheet and see how long it would take to read/write 80% of 8TB. It would of course be an estimate. I could likely prove the calculations if I removed one drive, wiped it clean, and reinstalled it. I think I'll do that on a test pool if I ever get to that point.
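Something like this rough sketch is what I have in mind (the numbers are just placeholders; you'd plug in whatever your benchmark reports, and it deliberately ignores IOPS and fragmentation, so it's a best-case floor rather than a prediction):

Code:
# Rough resilver-time estimate from sequential throughput alone.
# Ignores IOPS, fragmentation, and other system bottlenecks, so the
# result is a best-case floor, not a prediction.

def resilver_hours(capacity_tb, fill_fraction, throughput_mb_s):
    data_bytes = capacity_tb * 1e12 * fill_fraction    # data to rebuild
    seconds = data_bytes / (throughput_mb_s * 1e6)     # time at sustained throughput
    return seconds / 3600

# Example: 8 TB drive, 80% full, 150 MB/s sustained (placeholder numbers)
print(round(resilver_hours(8, 0.8, 150), 1))           # -> 11.9 hours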

If it's any consolation, work is nearing completion that will massively increase scrub/resilver speeds on spinning rust (SSDs get a decent boost too, but they're far less constrained by IOPS).
Heck yeah! I didn't realize work was being done on this. Is this just a change to some of the tunables, or is there something deeper in the code being changed that makes things run better?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Heck yeah! I didn't realize work was being done on this. Is this just a change to some of the tunables, or is there something deeper in the code being changed that makes things run better?
Instead of blindly following the tree and immediately issuing the reads to the drives, the read requests are cached in RAM and issued as this cache fills, by maximum contiguous size.

So, say you have a large file, but parts of it have been freed and replaced with something else. If that something else is found in time, the whole section can be issued as one long read instead of many small reads. Even if there are small holes, a large read can be issued and the holes ignored, since that's faster than doing all the small reads.
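In very loose terms, the coalescing idea looks something like this (an illustrative sketch only, not the actual OpenZFS code; the names and the gap threshold are made up):

Code:
# Illustrative sketch of the read coalescing, not the real implementation.
# Extents queued in the buffer are sorted by disk offset, and neighbours
# separated by small enough holes are merged into one large read.

def coalesce(extents, max_gap):
    """Merge (offset, length) extents whose gaps are small enough to ignore."""
    merged = []
    for off, length in sorted(extents):
        prev_end = merged[-1][0] + merged[-1][1] if merged else None
        if merged and off - prev_end <= max_gap:
            prev_off = merged[-1][0]
            merged[-1] = (prev_off, off + length - prev_off)   # extend over the hole
        else:
            merged.append((off, length))
    return merged

# Three chunks of a file separated by small holes become one long read:
print(coalesce([(0, 128), (160, 128), (320, 128)], max_gap=64))
# -> [(0, 448)]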

how long it would take to resilver an 80% full pool.
That's the thing, it massively depends on how the pool evolved, at least currently. Sequential scrub/resilver will improve it, but it will still be severely impacted by fragmentation (not just free space fragmentation, traditional fragmentation).
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Right, currently a fragmented pool will take longer to resilver than a non-fragmented pool. As I understand it, blocks are resilvered in age order rather than LBA order.

LBA order will vastly speed it up.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Right, currently a fragmented pool will take longer to resilver than a non-fragmented pool. As I understand it, blocks are resilvered in age order rather than LBA order.

LBA order will vastly speed it up.
It's not true LBA order, since that would require as much effort as BPR. The cache is filled in the same order as a traditional scrub/resilver. Large blocks are then issued as it fills up.
You can dedicate more RAM to this process, with diminishing returns.

Every once in a while, the whole buffer is emptied to get smaller blocks out of the way and to allow the operation status to be committed to the pool, so that it can be resumed. This is backwards compatible with older versions of ZFS, which can then finish the scrub like they traditionally would. A minor downside is that you might have to rerun the operation on a larger segment of the pool than with the traditional algorithm, in case of a reboot or whatever. Frankly, nothing of consequence is lost there...
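Very roughly, and again just as an illustrative sketch rather than the actual code (the buffer size and the names are invented), the cycle works out to something like:

Code:
# Conceptual sketch of the periodic flush/checkpoint cycle, not the real code.
# When the in-memory buffer fills, everything in it (stragglers included) is
# issued in offset order, and the scan position is committed to the pool so
# the operation can resume after a reboot -- even on an older ZFS, which would
# simply finish the scrub the traditional way from that point.

def flush(buffer):
    for off in sorted(buffer):                 # offset order, not discovery order
        print("read at offset", off)

def scrub(block_offsets, buffer_limit=4):
    buffer = []
    for off in block_offsets:                  # visited in traditional scrub order
        buffer.append(off)
        if len(buffer) >= buffer_limit:
            flush(buffer)                      # issue the sorted reads
            print("checkpoint committed at", off)
            buffer.clear()
    flush(buffer)                              # final partial flush

scrub([70, 10, 40, 20, 90, 30, 60, 50])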
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Sounds good.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
What's a guesstimated ETA for this tech reaching the stable release train? 1 year?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So this year, very good news.

EDIT: I just read a lot about the development of this new work; very interesting. Yeah, it's not just a simple tweak here and there, it's a serious code change that has shown some really great performance benefits: even in the worst case there is some benefit, and in the best case a superior performance gain. Very cool!
 