jgreco
Resident Grinch
- Joined: May 29, 2011
- Messages: 18,680
The new OpenZFS write throttle doesn't try to "benchmark your vdevs" to figure out how fast your pool is - you get a set of knobs (the dirty_data tunables) with some relatively sane defaults. Under low write pressure, the number of async I/Os (bulk dirty data flushes) queued to the vdevs themselves is low. As the amount of outstanding dirty data in the system increases, more I/O is queued up until you hit the point at which the throttle starts to apply; past that, the injected delay scales up rapidly toward a maximum value (currently 100ms). This behavior is consistent regardless of pool, boot time, or data ingested - there's no "learning" or "benchmarking" aspect. If you want it to behave differently (throttle sooner or later, more aggressively or more gradually, or allow for slower or faster vdevs), you fiddle with those knobs.
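For reference, the shape of that throttle can be sketched from the delay curve described in the comments in OpenZFS's dmu_tx.c. The tunable names below are the real knobs; the defaults shown (and the standalone Python itself) are just an illustration of the curve, not the actual fixed-point kernel implementation:

```python
# Sketch of the OpenZFS write-throttle delay curve, per the dmu_tx.c
# comments. Below a threshold there is no delay at all; above it, the
# delay follows a hyperbolic curve and is capped at 100ms per write.

# The "knobs" (defaults shown are illustrative; zfs_dirty_data_max is
# normally derived from physical memory):
zfs_dirty_data_max = 4 * 1024**3        # cap on outstanding dirty data, e.g. 4 GiB
zfs_delay_min_dirty_percent = 60        # throttle starts at 60% of that cap
zfs_delay_scale = 500_000               # ns; controls steepness of the curve
max_delay_ns = 100_000_000              # hard ceiling: 100 ms

def write_delay_ns(dirty_bytes: int) -> int:
    """Delay injected per write, given current outstanding dirty data."""
    min_dirty = zfs_dirty_data_max * zfs_delay_min_dirty_percent // 100
    if dirty_bytes <= min_dirty:
        return 0                        # below the threshold: no throttling
    if dirty_bytes >= zfs_dirty_data_max:
        return max_delay_ns             # at the cap: maximum delay
    # Hyperbolic: delay grows slowly at first, then explodes as dirty
    # data approaches zfs_dirty_data_max.
    delay = (zfs_delay_scale * (dirty_bytes - min_dirty)
             // (zfs_dirty_data_max - dirty_bytes))
    return min(delay, max_delay_ns)

# The delay is negligible just past the threshold and only nears the
# 100ms ceiling in the last stretch before the cap.
for pct in (50, 60, 70, 90, 99):
    d = zfs_dirty_data_max * pct // 100
    print(f"{pct}% dirty -> {write_delay_ns(d) / 1e6:.3f} ms")
```

This is why the behavior is consistent across pools: the curve depends only on the tunables and the current dirty-data level, not on any measured vdev speed.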
Yet it still behaves as though it does - an emergent second-order effect, perhaps. I don't really care. I came to ZFS years ago for VM block storage and found it largely unusable. I spent a lot of time dredging through stuff to understand what was going on, at a time when this wasn't a widely-understood issue, and arrived at some practical fixes, having wasted far too much time doing so.
I spent less time with the new write throttle, and even so it proved not to be everything that's claimed. It is *better* than it was. It can still be caught off-guard, though. I don't care to keep debugging it. I have better things to be doing.
Part of the problem may be that I have an unrealistic (ha) expectation that systems should be waiting around for workloads to be dumped on them. I have hypervisors with 40Gbps aggregate network links that average less than 100Mbps of steady production traffic. But when stuff needs to happen, it can and does. I may be positioned somewhat better than average to suddenly be throwing big loads at things.