melp
I'm currently planning for a 24-drive FreeNAS build and wanted to compare some different vdev configurations. I did some basic math to come up with relative probabilities of data loss and wanted to share these for comments and discussions.
A couple of notes before I get started: my build will be used as a media server, so performance isn't my highest priority and I won't be sticking to the (2^n)+x rule-of-thumb that cyberjock outlined on slide 33 of his excellent guide. I also don't account for the decrease of vdev reliability as the number of drives within that vdev increases; this is something I don't know very much about and was hoping to discuss here.
In addition to that, keep the following in mind (from DrKK, who helped a lot with these calculations):
Let's start with the assumption (try not to laugh, it's serious) that one drive failing in a vdev, or in any part of the zpool, is completely independent of another drive's failure. i.e., the fact that drive #3 just failed will have no bearing on when or if drive #7 fails. Now, you're thinking "obviously"!
But it's not so obvious. Two reasons:
- A lot of people tend to populate with drives from the same production run, i.e., made at around the same time (or even, the same exact lot) in the factory. Obviously, QC dynamics indicate that the drives in that particular lot could share characteristics, in terms of failure rates and times to failure. In other words, the very fact that one drive from the same lot failed might indicate that the other drives (if any) from the lot might be more prone to fail. I hope this is clear. We are going to *NOT* factor in this kind of intra-dependence.
- There is actually anecdotal evidence that drives, even if not from the same run, somehow may tend to fail in groups. Also, when one drive has failed in ZFS, the act of resilvering or whatever is itself a high-risk, high-intensity operation which puts the drive under considerably more stress than its default run condition. So, the *ACT* of trying to resilver the pool may actually, ironically, increase the likelihood that the pool dies. We are *NOT* going to factor this in either.

Both of the above properties are real and non-zero, but we'll assume "minor". If we tried to factor them in, we really WOULD need a statistician. Your mileage may vary.
With that in mind, the two configurations that I'm considering are as follows:
- 3 vdevs, 8 drives per vdev, each in RAIDZ2
- 2 vdevs, 12 drives per vdev, each in RAIDZ3
Let's assume that all 24 of your drives have a certain probability to fail during a given unit of time, and that you would use the same exact drives for either configuration. This would result in a certain rate of failure (similar to the ones that drive manufacturers pull out of their ass and call "mean time between failures" or "MTBF"). Because we're comparing the relative reliability (as opposed to the absolute reliability, i.e., a period of time), the exact rate doesn't matter as long as we're consistent with the two cases. If we call the mean time between failures r, we can treat the probability that a single drive fails during our unit of time as f = 1/r.
A few quick points for those who haven't studied basic probability before:
- Multiplication in probability is like an AND operator; if you multiply the probability of event X occurring with the probability of event Y occurring, you get the probability that event X AND event Y will occur. (This assumes the events are independent, which is exactly the assumption outlined above.)
- Addition in probability is like an OR operator; if you add the probability of event X occurring and the probability of event Y occurring, you get the probability that event X OR event Y will occur. (This assumes the events are mutually exclusive, i.e., they can't both happen at once.)
- 1 - probability(event X occurring) = probability(event X NOT occurring)
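If you'd like to play with these rules in code, here's a tiny Python illustration (the 0.5 and 0.2 values are just arbitrary examples, and the assumption each rule needs is noted in the comment):

p_x, p_y = 0.5, 0.2  # arbitrary example probabilities

p_x_and_y = p_x * p_y  # X AND Y, assuming X and Y are independent: 0.1
p_x_or_y = p_x + p_y   # X OR Y, assuming X and Y are mutually exclusive: 0.7
p_not_x = 1 - p_x      # X does NOT occur: 0.5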
The tool we'll use for all of this is the binomial distribution, which gives the probability of seeing exactly k failures among n drives when each drive fails independently with probability p:
f(k;n,p) = (n choose k) * p^k * (1-p)^(n-k)
where
(n choose k) = n!/(k!(n-k)!)
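If you want to follow along in code, here's a minimal Python sketch of that formula (binom_pmf is just my own name for it, not anything standard):

from math import comb  # comb(n, k) is "n choose k" (Python 3.8+)

def binom_pmf(k, n, p):
    # Probability of exactly k failures among n drives,
    # each failing independently with probability p.
    return comb(n, k) * p**k * (1 - p)**(n - k)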
For configuration 1, we'll have p = f (the probability of a single drive failure), n = 8, k = 3:
(8 choose 3) * f^3 * (1-f)^5
-> 56 * f^3 * (1-f)^5
This has 3 parts to it:
i) (8 choose 3) is saying "how many ways can I have 3 failures in 8 drives?" Using the binomial coefficient, we determine there are 56.
ii) f^3 is the probability that a particular set of 3 drives all fail
iii) (1-f)^5 is the probability that the other 5 drives don't fail.
Taken all together, it's saying:
The probability that...
...drives 1, 2, and 3 fail, and that 4, 5, 6, 7, and 8 don't fail, -OR-
...drives 1, 2, and 4 fail, and that 3, 5, 6, 7, and 8 don't fail, -OR-
...drives 1, 2, and 5 fail, and that 3, 4, 6, 7, and 8 don't fail, -OR-...
...and so on, 56 times, once for each possible combination of failures. Again, all of this is the probability that we'll lose 3 drives in one vdev. However, this alone doesn't fully account for the probability that we'll lose the vdev, since we can lose it by having 4 drives fail, or 5, 6, 7, or even all 8 drives. To account for these, we'd have to add 5 more binomial distributions, with n=8 and k=4 ... 8. With all these summed up, we'd have the probability that 3 or more drives in a vdev failed. That's a lot of terms. A simpler option makes use of the fact that:
probability(3 or more drives failing) = 1 - probability(2 or fewer drives failing)
Because of a similar trick you'll see in the next step, we'll actually use probability(2 or fewer drives failing), i.e., the probability that the vdev is still alive (the same equation as the probability that it's dead, but without the (1 - ...) part in front). We'll still use several binomial distributions (3 of them, to be exact, as opposed to 6 with the other way) with n=8 and k=2, 1, 0, and we'll sum them all up. This is what it'll look like (we'll call the whole thing A):
A = [(8 choose 2) * f^2 * (1-f)^6] + [(8 choose 1) * f^1 * (1-f)^7] + [(8 choose 0) * f^0 * (1-f)^8]
-> A = [28 * f^2 * (1-f)^6] + [8 * f * (1-f)^7] + [1 * 1 * (1-f)^8]
-> A = [28 * f^2 * (1-f)^6] + [8 * f * (1-f)^7] + [(1-f)^8]
Notice the 3 sets of [] brackets. The first set is the probability that 2 drives in our vdev fail, the second set is the probability that 1 drive fails, and the last is the probability that none of the drives fail. Summing all these up is saying "the probability that two drives fail -OR- one drive fails -OR- zero drives fail".
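As a sanity check, here's a short Python sketch (reusing the binom_pmf function from above) that computes A and confirms the complement trick, using an arbitrary example value of f = 0.05:

f = 0.05  # example per-drive failure probability (arbitrary)

# A = probability an 8-drive RAIDZ2 vdev survives (2 or fewer failures)
A = sum(binom_pmf(k, 8, f) for k in range(3))

# The long way: 1 - probability(3 or more failures)
A_check = 1 - sum(binom_pmf(k, 8, f) for k in range(3, 9))

print(A, A_check)  # both ~0.99421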
Now we need to account for the fact that we have 3 vdevs, and that if at least one of them fails (2 could fail, or even all 3), we lose the whole zpool. We could do this with another set of binomial distributions, this time using p = (1 - A) (the probability that a single vdev fails), n = 3, and k = 1, 2, and 3. An easier option is to use the same trick as above:
probability(at least one vdev fails) = 1 - probability(all 3 vdevs are alive)
We calculated the probability of a single vdev being alive in the previous step, and we'll use that here to calculate C1, the probability of losing our whole zpool in configuration 1:
C1 = 1 - A^3
-> C1 = 1 - ([28 * f^2 * (1-f)^6] + [8 * f * (1-f)^7] + [(1-f)^8])^3
To reiterate, A is the probability that one of our vdevs is healthy, so A^3 is the probability that vdev1 AND vdev2 AND vdev3 are healthy, and 1 - A^3 is the opposite of that, i.e., at least one vdev has failed (and our whole zpool is lost).
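In the Python sketch, C1 is then a one-liner:

# The pool survives only if all 3 vdevs survive, so:
C1 = 1 - A**3
print(C1)  # ~0.0173 for f = 0.05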
Now let's look at configuration 2. In this config, we have 2 vdevs and any 4 (or more) drives in the same vdev must fail for us to have data loss, but a loss of either vdev will result in a total loss of the zpool. We'll proceed in the same way as config 1, using the same trick to compute the probability that one vdev is alive, but with p = f (same as before), n = 12, k = 3, 2, 1, and 0:
B = [(12 choose 3) * f^3 * (1-f)^9] + [(12 choose 2) * f^2 * (1-f)^10] + [(12 choose 1) * f^1 * (1-f)^11] + [(12 choose 0) * f^0 * (1-f)^12]
-> B = [220 * f^3 * (1-f)^9] + [66 * f^2 * (1-f)^10] + [12 * f * (1-f)^11] + [1 * 1 * (1-f)^12]
-> B = [220 * f^3 * (1-f)^9] + [66 * f^2 * (1-f)^10] + [12 * f * (1-f)^11] + [(1-f)^12]
Again, this is the probability that one of our 12-drive vdevs is alive. As above, we'll determine the probability that at least one vdev fails by computing 1 minus the probability that both vdevs are alive, and we'll call this C2:
C2 = 1 - B^2
-> C2 = 1 - ([220 * f^3 * (1-f)^9] + [66 * f^2 * (1-f)^10] + [12 * f * (1-f)^11] + [(1-f)^12])^2
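And the corresponding sketch for configuration 2, with the same example f = 0.05:

# B = probability a 12-drive RAIDZ3 vdev survives (3 or fewer failures)
B = sum(binom_pmf(k, 12, f) for k in range(4))

# The pool survives only if both vdevs survive, so:
C2 = 1 - B**2
print(C2)  # ~0.0045 for f = 0.05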
At this point, we want to compare them. Let's look at a graph of both C1 and C2 on Wolfram Alpha: https://www.wolframalpha.com/input/?i=graph+%281-%28%5B28*f%5E2*%281-f%29%5E6%5D%2B%5B8*f*%281-f%29%5E7%5D%2B%5B%281-f%29%5E8%5D%29%5E3%29+and+%281-%28%5B220*f%5E3*%281-f%29%5E9%5D%2B%5B66*f%5E2*%281-f%29%5E10%5D%2B%5B12*f*%281-f%29%5E11%5D%2B%5B%281-f%29%5E12%5D%29%5E2%29
Here's a close-up of the part we care about: http://i.imgur.com/rVxLqjq.png (blue = config 1, red = config 2)
Obviously, this graph is showing the probability of a failure (as opposed to the rate of failure), so smaller numbers are better. Based on this, it appears that the probability of losing our whole zpool is always lower when using configuration 2.
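To double-check the graph numerically, here's a quick sweep over a handful of arbitrary f values using the same sketch functions:

for f in (0.01, 0.02, 0.05, 0.10, 0.20):
    A = sum(binom_pmf(k, 8, f) for k in range(3))
    B = sum(binom_pmf(k, 12, f) for k in range(4))
    C1, C2 = 1 - A**3, 1 - B**2
    print(f, C1, C2, C2 < C1)

In every case I tried, C2 comes out smaller than C1, matching the graph.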
You can easily apply the process I've outlined here to any other configurations to compare them, but take the results with a grain of salt. Again, I'm not accounting for the apparent decrease of vdev reliability as the number of drives within that vdev increases, nor for the possibility that drive failures may not be entirely independent.
Footnote: This has been an iterative process, and you can see some of the steps below. DrKK helped out a lot with getting everything straight and simplified. If you have any questions, comments, corrections, please let me know.