nickt
Contributor
Joined: Feb 27, 2015
Messages: 131
So my parts are in transit for my first FreeNAS build - looking forward to getting it going! I've seen heaps of great advice in these forums, and I plan to follow it, including the use of RAIDZ2 (6 x 3 TB drives). I am also using ECC RAM (2 x 8 GB) and a good board (I hope: C2750D4I). I will also have a good backup strategy in place.
But I have to say that I am not entirely convinced by the "RAIDZ1 is dead" thing - there are some real world aspects to this that I can't reconcile.
Yes, I absolutely get the basic maths (I'd have a 23.7% chance of recovering from a single drive failure if I used RAIDZ1 assuming 10^-14 URE drives), but there are a whole bunch of assumptions in this calculation...
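For anyone who wants to check the 23.7% figure, here is a minimal sketch of the calculation. It assumes the spec-sheet rate of one URE per 10^14 bits applies independently to every bit read, and that the rebuild reads 18 TB (six 3 TB drives' worth). The function name and parameters are just mine for illustration. If you only count the five surviving drives (15 TB), the same formula gives roughly 30% instead.

```python
import math

def p_no_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of reading `bytes_read` bytes without hitting a URE.

    Assumption: every bit fails independently at the quoted spec rate,
    which is the same simplification the "RAIDZ1 is dead" maths uses.
    """
    bits = bytes_read * 8
    # (1 - p) ** bits, evaluated via log1p so the tiny p is not lost to rounding
    return math.exp(bits * math.log1p(-ure_per_bit))

TB = 1e12  # decimal terabyte, as printed on the drive label

print(f"18 TB read: {p_no_ure(18 * TB):.1%}")  # 23.7%
print(f"15 TB read: {p_no_ure(15 * TB):.1%}")  # 30.1%
```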
What I can't quite reconcile is how these numbers play out in real life with regular scrubs. I'll be doing those too, but surely the maths suggest that a significant majority of scrubs will detect (and correct) at least one URE.
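To put a number on "a significant majority of scrubs", the same model can be turned around to give the expected URE count per scrub. This is only a sketch under my own assumptions: a scrub reads just the allocated blocks (data plus parity), approximated here as raw capacity times a fill fraction, and the 10^-14 spec rate is treated as a true per-bit probability. The helper name is hypothetical.

```python
import math

def scrub_ure_stats(raw_bytes: float, fill_fraction: float,
                    ure_per_bit: float = 1e-14) -> tuple[float, float]:
    """Return (expected URE count, probability of at least one URE) for one scrub.

    Assumption: the scrub reads raw_bytes * fill_fraction bytes in total.
    """
    bits = raw_bytes * fill_fraction * 8
    expected = bits * ure_per_bit
    # 1 - (1 - p) ** bits, using expm1/log1p to stay accurate for tiny p
    p_at_least_one = -math.expm1(bits * math.log1p(-ure_per_bit))
    return expected, p_at_least_one

RAW = 18e12  # 6 x 3 TB drives
for fill in (0.25, 0.5, 0.8):
    expected, p = scrub_ure_stats(RAW, fill)
    print(f"{fill:.0%} full: {expected:.2f} expected UREs, "
          f"{p:.0%} chance of at least one")
```

On these assumptions a pool at 80% full would see a bit over one URE per scrub on average, which is exactly the prediction I find hard to square with what people seem to report.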
Is this what actually happens? Do FreeNAS users find that *most* scrubs experience a URE (that needs to be corrected)? And if so, what does this mean? My working assumption was that if a scrub found an error, it was an indication that I had a drive that was on the way out, so I should replace said drive. But does that mean that I'm likely to be replacing a drive every few weeks (depending on my scrub schedule)?
My guess is that most scrubs do not find an error, so I am left wondering about the maths involved with concluding that RAIDZ1 is dead. (My underlying assumption here is that a scrub and a resilver both require the same intensity of read activity and checksum calculation/checking, and so the processes are inherently similar.)
Like I say, I plan to "play by the rules", but I'm just not entirely sure about the evils of RAIDZ1. Assuming I need to keep my system below 80% capacity, RAIDZ2 reduces my 18 TB of raw disk (about 16 TiB) to an effective capacity of 9.6 TB - RAIDZ1 would be nice...
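The capacity trade-off is simple arithmetic, sketched below. It assumes decimal TB, the 80% fill guideline, and ignores ZFS metadata and padding overhead, so real-world figures will come out somewhat lower. The function name is made up for this post.

```python
def usable_tb(n_drives: int, drive_tb: float, parity_drives: int,
              max_fill: float = 0.8) -> float:
    """Rough usable capacity in decimal TB for a single RAIDZ vdev.

    Assumption: usable space = data drives * drive size * fill guideline,
    with no allowance for ZFS metadata or allocation padding.
    """
    data_drives = n_drives - parity_drives
    return data_drives * drive_tb * max_fill

print(f"RAIDZ2: {usable_tb(6, 3, parity_drives=2):.1f} TB")  # 9.6 TB
print(f"RAIDZ1: {usable_tb(6, 3, parity_drives=1):.1f} TB")  # 12.0 TB
```

So RAIDZ1 would buy roughly 2.4 TB of extra usable space on this build.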
Interested in your feedback - what is your experience with scrub error rates? Am I making some wrong assumptions in how scrubs and resilvers work?