DrKK
FreeNAS Generalissimo
My last mile is FTTP/FiOS. ;)
That won't happen until the US last mile providers get a frickin' clue.
After reading the online article about how (and why) RAID5 "died" in 2009, and seeing a reference in the newbie slideshow that even RAID-Z3 will no longer be able to guarantee data protection as soon as 2019, I am very curious as to what comes next.
You are so right. Silly me, my brain assumed that since Figure 7 has four sections and it's just labeled "probability" it was 25% per section, but it's just an expansion of the same data on Figure 6. My mistake.
I assume Figure 6 is supposed to be the one actually showing 0-100% on the left, which means RAID5 reaches 100% probability of data loss about 2017. RAID-Z3 (or their very interesting RAID-7.3 terminology) appears to be in pretty good shape till about, what, 2030, at a wild guess?
Thanks for the perspective.
When the drive density starts to approach the mathematical limits of URE, it would seem to me that we will build error correction into the device.
Without going to one of our cable company's "business" plans, backing up any reasonable proportion of my data to "the cloud" is out of the question: our "residential" plan is only 4Mbps up, and there is a monthly cap as well. The "business" plan seems to eliminate the cap but is still only 4Mbps up.
Cloud storage is becoming many folks' primary way of using data? I mean, sure, it's growing, but I don't know that I would characterize it as taking over traditional data storage. My only cloud storage is emergency backups.
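To put that 4Mbps uplink in perspective, here's a quick back-of-the-envelope sketch; the backup sizes are made-up examples, not figures from the thread:

```python
# Back-of-the-envelope upload times for a 4 Mbps uplink.
# The backup sizes below are illustrative examples only.
def upload_days(gigabytes: float, mbps: float = 4.0) -> float:
    bits = gigabytes * 8e9          # decimal GB -> bits
    seconds = bits / (mbps * 1e6)   # link rate in bits per second
    return seconds / 86400          # seconds -> days

for gb in (100, 1000, 4000):
    print(f"{gb:>5} GB at 4 Mbps: ~{upload_days(gb):.0f} days")
# ~2 days for 100 GB, ~23 days for 1 TB, ~93 days for 4 TB, before any cap
```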
LOL. I should have written 'additional' error correction into the device. I was proposing something stronger than a larger ECC block... But also NOT suggesting my idea was original or unique.
Thanks for the nice links. I didn't notice they had to ratchet up the ECC block size.
Do the "Enterprise" drives have better ECC?
Right, but every byte they add to ECC means one less byte to store real data. So they have to decide how much to allocate for real data, and how much to allocate for ECC. People don't buy hard drives that have a box that says "now with 4 more bytes of ECC per sector!". People *do* buy the drives that say "6TB" over "5TB".
This is one of those "secret sauce behind the recipe" things that 99.9% of sheeple will never know about, never care about, and never bother to question. So long as Brand-X's hard drives aren't super reliable compared to Brand-Y's, everything is okay. Even if they are all equally unreliable, that's not a problem that the manufacturers care about.
They are there to sell you hard drives; as many of them as they can, for as much profit as they can. Any choice that doesn't promote at least one of those goals (and preferably both) is not going to happen. So manufacturers are not going to double the ECC on their hard drives tomorrow and tell you that the drive is so much better because they did that. At least, not unless some other company does it first. And will you be the guy that buys the 7TB hard drive for $300, or the guy that buys the 7.5TB drive for the same price? Betting you'll buy the 7.5TB, totally unaware that maybe that 7TB drive has more ECC.
Decisions like this have already been made based on market forces. So unless you are going to start your own company and compete with the other companies to make the most reliable hard drive, you are stuck with what you can buy.
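For a rough feel of the capacity-vs-ECC tradeoff described above (and the "ratchet up the ECC block size" point a few posts back), here's a small sketch; the per-sector byte counts are the figures commonly quoted for the 512-byte-to-4K Advanced Format transition, not numbers from this thread, and real per-model allocations are vendor secrets:

```python
# Sketch of the data-vs-ECC tradeoff: every ECC byte is a byte of platter
# that can't be sold as capacity. Byte counts are the commonly quoted
# Advanced Format figures (~50 ECC bytes per 512B sector, ~100 per 4KiB).
def ecc_overhead(data_bytes: int, ecc_bytes: int) -> float:
    return ecc_bytes / (data_bytes + ecc_bytes)

print(f"512B sectors: {ecc_overhead(512, 50):.1%} of the platter spent on ECC")
print(f"4KiB sectors: {ecc_overhead(4096, 100):.1%} spent on ECC")
# Bigger sectors buy stronger correction per byte of overhead, which is why
# the industry changed sector size rather than advertising "more ECC".
```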
well... AFAIK, enterprise doesn't use hard drives.... :D
The future of the IT professional ...
If you can get the isolinear chips in the right slots, you get to live another day.
Do the "Enterprise" drives have better ECC?
But Enterprise drives claim a lower rate of UREs, which *is* the mathematical chance of a read error, which is what this whole thread *is* actually talking about, even if not everyone knows that.
However, it is worth noting that this has always been a dicey metric to begin with, and probably doesn't translate to useful data, in the same way that MTBF isn't really directly meaningful. Nonrecoverable read error rates aren't likely to magically all be 1 in 10^14 bits for consumer-grade drives and 1 in 10^15 for enterprise drives, across many years, underlying technology changes, etc. It MAY be indicative of somewhat better materials/design/etc., but it is also definitely indicative of the fact that they'd prefer enterprises to buy the more pricey drives.
The whole point of RAID, however, was to create a redundant array of inexpensive disks, and to tackle the problem that way. The difference between 1 in 10^14 and 1 in 10^15 isn't particularly meaningful in that context, because, again, data loss is tied to the probability of two drives losing the same block simultaneously, not just two drives losing some arbitrary unrelated blocks simultaneously.
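To make those URE specs concrete, a minimal sketch, assuming the quoted rate behaves as an independent per-bit probability (itself a dicey assumption, as the post says) and using a 4TB drive purely as an example:

```python
import math

# Chance of hitting at least one URE while reading a whole drive, treating
# the quoted spec as an independent per-bit error rate (a big simplification).
def p_ure_full_read(drive_tb: float, ure_per_bit: float) -> float:
    bits = drive_tb * 1e12 * 8                 # decimal TB -> bits
    return 1 - math.exp(-bits * ure_per_bit)   # Poisson approximation

for rate in (1e-14, 1e-15):
    print(f"4 TB full read at URE {rate:g}/bit: "
          f"{p_ure_full_read(4, rate):.0%} chance of an error")
# ~27% vs ~3%; yet the chance of two drives losing the SAME block at the
# same time is vastly smaller than either figure, which is the point above.
```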
It's actually really only tangentially related, and I'm kinda surprised you'd say such a thing. What you're actually looking for is the likelihood of data loss on a pool, and how we can affect that in the future.
RAIDZ1 dies "in 2009" for a very specific reason: the loss of a single disk eliminates the redundancy for the pool. When you're rebuilding, you actually do need each and every sector on the remaining drives that contains pool data to be readable, or you will encounter some loss of data. That is very much intertwined with the URE values you're discussing.
RAIDZ2, however, retains redundancy. Because of that, the URE values are of less concern. As long as the redundancy is capable of recovering the data, you're still fine. The problem with RAIDZ2 is that if you lose a drive, any block on the remaining drives which falls victim to a URE has effectively lost its redundancy, but it is still totally recoverable.
We do run into a problem with that, however, as the rebuild times increase. The likelihood of a second drive failing during a multi-day rebuild with these modern large drives is substantially greater than the chance of failure striking during the rebuild of a much smaller drive.
RAIDZ3 extends that out further. At this point, the impact of the URE rate is essentially meaningless, because you're multiply covered even for two failures. Again, as I pointed out earlier on, this is actually a problem in statistics, and statistically speaking, you're very likely to retain availability of a data block as long as you haven't lost access to that block either through a URE on that same block on the other drives or through losing those drives entirely.
As we move out from RAIDZ1 to RAIDZ3, the statistical likelihood of UREs simultaneously affecting enough replicas of a block to render the block unavailable becomes more and more remote, making the issue of UREs less and less important. What's becoming more and more important is the MTBF of a drive, since the loss of a drive renders ALL blocks on the drive unrecoverable, and the window during which reduced redundancy exists continues to grow due to the massive size of modern drives.
Increasing the RAIDZ level takes UREs from fairly important to almost meaningless.
Increasing the size of the drive increases the resilvering time, which provides a larger window of reduced redundancy for the pool.
These are actually independent variables, but they can cooperate to help build a more reliable pool.
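A minimal sketch of those two variables in action, using assumed numbers (8TB drives, ~100MB/s resilver speed, a 6-wide vdev, and the usual 1e-14 URE spec; none of these figures come from the thread):

```python
import math

# Assumed figures: drive size (TB), resilver throughput (MB/s),
# URE spec (per bit), and vdev width. None come from the thread.
DRIVE_TB, RESILVER_MBPS, URE_PER_BIT, WIDTH = 8, 100, 1e-14, 6

# Variable 1: drive size sets the window of reduced redundancy.
hours = DRIVE_TB * 1e12 / (RESILVER_MBPS * 1e6) / 3600
print(f"Minimum resilver window: ~{hours:.0f} hours of reduced redundancy")

# Variable 2: RAIDZ level sets how much a URE matters. RAIDZ1 after one
# loss: a single URE anywhere on the survivors during the full read means
# some data is unrecoverable.
bits_read = (WIDTH - 1) * DRIVE_TB * 1e12 * 8
p_loss_z1 = 1 - math.exp(-bits_read * URE_PER_BIT)
print(f"RAIDZ1 resilver: ~{p_loss_z1:.0%} chance of an unreadable block")

# RAIDZ2/Z3 after the same loss: a URE only costs redundancy on that one
# block; data dies only if another failure hits the very same block, which
# is why the URE rate fades and the resilver window/MTBF dominates.
```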
It's meaty posts like this that make me glad I started this thread.
For sure, the probability of losing a second or even a third drive during ever-longer rebuild times is a big factor in why even RAID-Z2 makes me nervous on arrays of more than about 6 or 7 disks.
According to the Backblaze blog they've had annual failure rates of around 30% with some models of 1.5TB & 3TB Seagates.
If I'd been unlucky enough to choose those drives for a 9-disk array, even RAID-Z3 would have been hard pressed to keep that array functional.
Couple those failure rates with the fact that, in the real world, an array will often be built from drives of the same manufacturing batch that experience identical usage, vibration, and heat conditions, and you've got a drastically higher probability of multiple drive failures within a short time period than the MTBF figures would seem to suggest.
Fortunately most drive models have failure rates less than 5%. I'm planning on sticking with HGST NAS drives as they seem to mostly hover around 0.5 to 2.5% failure rates pretty reliably.
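As a quick binomial sanity check on those numbers, assuming independent failures (optimistic, given the same-batch effects described above):

```python
from math import comb

# Chance of k or more of n drives failing within a year at a fixed annual
# failure rate (Backblaze-style AFR), assuming independent failures.
# Losing 4 of 9 drives is what sinks a 9-disk RAID-Z3 vdev.
def p_at_least(n: int, k: int, afr: float) -> float:
    return sum(comb(n, i) * afr**i * (1 - afr)**(n - i)
               for i in range(k, n + 1))

for afr in (0.30, 0.02):
    print(f"AFR {afr:.0%}: P(4+ of 9 fail within a year) = "
          f"{p_at_least(9, 4, afr):.4%}")
# ~27% at a 30% AFR versus ~0.002% at 2%, before correlated-failure effects
```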
I am fully in agreement that long rebuild times are becoming more of an issue than URE rates. Either way, ZFS needs to be able to deal with it.
By the way, I don't know if WD Reds are still being recommended much around here but they don't seem to be doing well on Backblaze's report. Link. Failure rates of up to 13%.