"Cheap and deep" vs...not that.

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
The next focus in my exploration is the choice of hard drive. As I have mentioned elsewhere, I don't have significant storage needs, at least for now. I am likely to go with RAIDZ2, which seems a good balance for my own performance / space / risk sensitivities! I am mindful of the constraints with extending a vdev and upfront planning in that context.

As I research my Black Friday options :), I've set myself a tentative budget of $400 for my hard drives (which, based on history, may well turn into $450 at some point!). Assuming I do go with RAIDZ2, I am trending towards a larger number of small drives (1TB), as opposed to fewer larger drives. My thinking here is that this will help with rebuild times, if necessary, as well as being able to justify the cost of a hot spare (which still needs some research on my part).

I am considering maybe 7-8 1TB drives, as opposed to 4 x 4TB drives.

Is there anything I'm missing with this philosophy?

By the way, to check another conclusion I drew, are there any advantages at all to dividing x disks into separate vdevs, when I have the option to just throw them into a single vdev at the outset?

Sidenote : I was using a RAIDZ storage calculator that stated that RAIDZ2 requires a minimum of five - not four - drives. I am assuming that is just wrong, but if there's something I should be aware of please let me know.
 

subhuman

Contributor
Joined
Nov 21, 2019
Messages
121
markwill said:
I've set myself a tentative budget of $400 for my hard drives ...
...I am considering maybe 7-8 1TB drives, as opposed to 4 x 4TB drives.
Unless prices for you are significantly different from what I'm seeing:
1TB drives: $30/TB
3TB drives: $16/TB
4TB drives: $18/TB

You specified "cheap" as a consideration, so I'm just throwing this out there, but 3TB seems to be the "sweet spot" right now.
You can bring it down a little more if you want to include "White Label" drives in there, but:
7-8 x 1TB drives: $210-$240; capacity in RAIDZ2: 5 or 6TB
5-6 x 3TB drives: $250-$300; capacity in RAIDZ2: 9 or 12TB
4 x 4TB drives: $300; capacity in RAIDZ2: 8TB
Going with the "powers of two" rule, the preferred configurations would be the 6x3TB or 4x4TB RAIDZ2. Both of which cost about the same, and are well under your budget.
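To make the cost comparison concrete, here's a quick back-of-the-envelope sketch in Python. The prices are the rough figures quoted above, and the math deliberately ignores ZFS metadata/padding overhead and the 80% fill guideline, so treat the numbers as approximations:

```python
# Rough cost-per-usable-TB comparison for the RAIDZ2 options above.
# Prices are the ballpark figures from this thread, not live quotes.

def raidz2_usable_tb(num_drives: int, drive_tb: float) -> float:
    """RAIDZ2 stores (n - 2) drives' worth of data; the other 2 are parity."""
    return (num_drives - 2) * drive_tb

configs = [
    # (label, drive count, drive size in TB, approximate total cost in USD)
    ("8 x 1TB", 8, 1, 240),
    ("6 x 3TB", 6, 3, 300),
    ("4 x 4TB", 4, 4, 300),
]

for label, n, size, cost in configs:
    usable = raidz2_usable_tb(n, size)
    print(f"{label}: {usable:.0f}TB usable, ${cost / usable:.0f}/usable TB")
```

Run that and the 6 x 3TB option comes out well ahead on dollars per usable terabyte.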

My thinking here is that this will help with rebuild times, if necessary, as well as being able to justify the cost of a hot spare (which still needs some research on my part).
Only you can answer if the extra expense for a hot spare is justified.
If you do NOT have physical access to the server, then yes a hot spare is a REALLY good idea.
If you live somewhere really remote where it will take you days to get a replacement shipped to you, you may want a spare (hot or otherwise).
If the extra cost is worth it for your peace of mind, and you can accept that there's a very real chance you'll never use that spare (it may sit unused, collecting dust, until it's old and obsolete), then you may want a spare.

Rebuilding (resilvering) only copies data from used areas of the disks. So yes, resilvering 3 or 4TB drives in the above configurations will take longer than resilvering 1TB drives- assuming the same amount of total data in both cases. However, it will not take 3-4x longer to resilver those larger drives.
Meaning, the 8x1TB vdev we discussed above, at 80% full (the max you should hit), has 4.8TB of data on it. At 80% full, each disk holds 800GB, so 800GB has to be written to a replacement drive when resilvering.
A 6x3TB vdev, with the same 4.8TB of data on it, has about 1.2TB of data on each drive. That's only 50% more to be written to a replacement drive when resilvering, despite the drive capacities being 3x the size of the 1TB drives.
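The per-disk arithmetic above can be sketched as follows. It assumes user data plus parity is spread evenly across all drives in the vdev, which is only approximately true in practice:

```python
# Per-disk data written during a resilver of a RAIDZ2 vdev holding a
# fixed amount of user data. Raw on-disk data (user data + parity) is
# spread evenly across all drives, so the replacement drive receives
# roughly raw_total / n.

def resilver_tb_per_disk(user_data_tb: float, num_drives: int, parity: int = 2) -> float:
    raw_total = user_data_tb * num_drives / (num_drives - parity)
    return raw_total / num_drives  # simplifies to user_data_tb / (num_drives - parity)

print(resilver_tb_per_disk(4.8, 8))  # 8 x 1TB vdev -> roughly 0.8 TB per disk
print(resilver_tb_per_disk(4.8, 6))  # 6 x 3TB vdev -> roughly 1.2 TB per disk
```

So tripling the drive size only adds about 50% to the amount resilvered, given the same user data.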
 

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
Thank you for the excellent and helpful response, @subhuman.

A few further questions, if I may...
  • Can I assume from the links you included that you think drives sold as "for NAS" are really just a marketing gimmick (I noticed you didn't filter on that)? I was filtering in NewEgg on Usage = NAS, which of course comes up with the usual IronWolf, etc. drives, which is how I came up with the rough swags on prices I had in mind. It was my understanding that these had specific physical and firmware characteristics better suited to constantly running drives, as will be the case with a NAS. Is that not a real factor?
  • I probably didn't explain the "cheap" thing very well. Within reason, I don't mind spending my dollars where it's justified, so I am not necessarily looking for "cheap". My comment there was just to highlight the decision about more cheapER, smaller drives vs. fewer, more expensive larger drives.
  • However, I'd be interested in your thoughts on what I will need if a drive does go bad. By way of example, let's say I use RAIDZ2 and one drive fails. While I obviously have the redundancy to tolerate that, I would also be looking to replace it soon, since I am then a single drive away from complete loss of my data (which is an unacceptable risk for me, over any length of time). What factors would play into the new drive I use to replace the failed one? Does it simply need to be of the same size or larger (I understand I would waste space if larger)? Is there any realistic benefit in replacing it with the same make/model as the remaining drives?
  • One thing I read today that was news to me is that a vdev essentially has the performance characteristics of a single drive. Is that correct and, aside from other vdev decisions, is that an argument in favor of more than one vdev, where I have enough drives to accommodate RAIDZ2 on both vdevs?
  • Aside from noise (I assume), what would be the justification in a 7200 rpm drive (over a 5400 rpm drive) when both are rated at 6Gb / sec?
  • In researching this more, and also with thanks to your explanation, I am probably going to avoid a hot spare.
Thank you again.
 

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
PS : If the vdev performance issue is accurate, then a nice balance for me might be 8 x 3TB, divided into 2 vdevs of four drives each. That does assume I can get comfortable with using "non-NAS" drives though.
 

subhuman

Contributor
Joined
Nov 21, 2019
Messages
121
Can I assume from the links you included that you think drives sold as for NAS are really just a marketing gimmick
No. They are different; at minimum, their firmware has different programmed behavior in some situations.
The general behavior difference is that a stand-alone drive will perform many retries on a sector when it gets a read error, assuming it's all alone and retrieving the data is vital. These retries can take considerable time, even noticeable to us slow humans. A NAS drive, by contrast, will retry very few times, working on the assumption that even if this drive can't read the sector, the other drives in the array can, and the "missing" drive's data can be rebuilt, so just move on and read the next sector. So a non-NAS drive can really tank overall performance if it has a few sectors that are hard to read.
So let's dig into the guts of SMART a little.
If a sector takes more than "x" retries to be read, the drive flags it to be remapped (this is the SMART value "pending sector count"). "This sector is going bad, let's stop using it." The next time the drive tries to write data to that "bad" sector, it doesn't; it writes it elsewhere, and flags the bad sector as unusable (SMART value "reallocated sector count"). This is all fine and good. In fact, it's great. Except for one fatal flaw: every drive I've been able to dig into will only remap a sector when that sector is to be written to. Meaning, if you have bad sectors on a drive that you write to once, then from then on only read from (like a drive holding media files, for example), the bad sectors never get remapped. What starts as a slight hiccup in xfer rates as the drive keeps hammering the retries gets worse over time, taking longer and longer to read that sector. Because bad sectors don't become good, they become worse with time. Eventually more bad sectors crop up on that drive, and the problem gets several times worse. Something that shouldn't be a problem, that can be solved by simply writing data to the disk, never gets solved.
Enter ZFS and FreeNAS.
A periodic "scrub" is performed on your vdevs. By default, IIRC, it's weekly. You didn't turn off scrubs, right? FreeNAS reads all data, monitoring SMART as it goes, and if it finds pending sectors it forces a write to reallocate the data. The drive relocates the sector, and all is fixed. Problem solved.
Long story short, the biggest problem... isn't a problem here.
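If you want to keep an eye on those two attributes yourself, here's a minimal sketch using smartmontools' `smartctl`. The parsing is based on the common `smartctl -A` table layout and may need adjusting for a given drive; the device name `/dev/ada0` is just an example:

```python
# Minimal sketch: read the two SMART attributes discussed above via
# smartctl (from smartmontools). Assumes the usual `smartctl -A` table
# layout: ID#, ATTRIBUTE_NAME, ..., RAW_VALUE as the last column.
import subprocess

WATCHED = {"Current_Pending_Sector", "Reallocated_Sector_Ct"}

def parse_smart_attrs(text: str) -> dict:
    """Pull raw values for the watched attributes out of smartctl -A output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] in WATCHED:
            attrs[fields[1]] = int(fields[-1])
    return attrs

def check_disk(device: str) -> dict:
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    return parse_smart_attrs(out)

# Example (device name is hypothetical): check_disk("/dev/ada0")
```

A nonzero pending-sector count on a drive you rarely write to is exactly the situation the scrub fixes.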
I was filtering in NewEgg on Usage = NAS, which of course comes up with the usual Ironwolf, etc drives, which is how I came up with the rough swags on prices I had in mind. It was my understanding that these had specific physical and firmware characteristics that were more suited to constantly running drives, as will be the case with NAS. is that not a real factor?
NAS, CCTV and Enterprise drives are generally all designed for 24/7 usage.
Ok, personally I use mostly HGST drives (which are up on those pages in the price ranges quoted) and some WD purples. Yes, WD Purples. Performance and firmware-wise they're pretty much Greens, but with a wider operating temperature range.
But my small numbers of drives are statistically insignificant. Backblaze has a lot to say on the topic:
Should we switch to enterprise drives?
Assuming we continue to see a failure rate of 15% on these drives, would it make sense to switch to “enterprise” drives instead?
There are two answers to this question:
  1. Today on Amazon, a Seagate 3 TB “enterprise” drive costs $235 versus a Seagate 3 TB “desktop” drive costs $102. Most of the drives we get have a 3-year warranty, making failures a non-issue from a cost perspective for that period. However, even if there were no warranty, a 15% annual failure rate on the consumer “desktop” drive and a 0% failure rate on the “enterprise” drive, the breakeven would be 10 years, which is longer than we expect to even run the drives for.
  2. The assumption that “enterprise” drives would work better than “consumer” drives has not been true in our tests. I analyzed both of these types of drives in our system and found that their failure rates in our environment were very similar — with the “consumer” drives actually being slightly more reliable.
From poking around the forums here, there's quite a few people using consumer drives- even cheap ones- in their FreeNAS setup. So I'll leave it to you to decide. I will say this, I mentioned some of mine are HGST, and if you look at different Backblaze quarterly drive reports the HGST drives are consistently at the lowest failure rates. Some people don't like 7200 RPM drives, which they are, and some say they're noisier. Can't say I've ever been bothered by the latter.

I probably didn't explain the "cheap" thing very well. Within reason, I don't mind spending my dollars where it's justified, so I am not necessarily looking for "cheap". My comment there was just to highlight the decision about more cheapER, smaller drives vs. fewer, more expensive larger drives.
Right, but you mentioned 8x1TB drives. From the first page I linked previously, these are the second-cheapest and are server drives designed for 24/7 use with a 2 year warranty:

...they even come with free Dell drive sleds, just in time to give them to people for the upcoming Holidays!

For what you said you were considering that's $240 for a RAIDZ2 capacity of 6TB.
From my second link in my previous post:

6 of them for $300, still well under your budget of $400 you were expecting to spend, and in a RAIDZ2 will have a capacity of 12TB.
I also get that you said you don't need a lot of space... right now. But this is something that should last several years at least. Your needs are really unlikely to shrink and much more likely to grow, and you should not fill any pool to greater than 80% of its max capacity. That 8x1TB drive pool would only have a usable capacity of 4.8TB. That's really not much.

Gonna address this next part out-of-order:
By way of example, let's say I use RAIDZ2 and one drive fails. While I obviously have the redundancy to tolerate that, I would also be looking to replace it soon, since I am then a single drive away from complete loss of my data
No. If one drive fails in a RAIDZ2, two more drives have to fail before you lose data. Two drives can fail, you still get your data back.
However, I'd be interested in your thoughts on what I will need if a drive does go bad.
Ok, I kinda touched on this in my previous post.
If you're renting space in a remote location for your server and do not have easy physical access to the server, then yes a hot spare is pretty much necessary. You can bring it online and resilver through the GUI.
If you live somewhere really, really remote, in the middle of nowhere, where it will take days for a Sherpa to hike to you with a replacement drive, you should keep a spare on hand.
If you, like most people, live somewhere that gets regular mail/UPS/whatever delivery, you can order a replacement drive and have it arrive in a few days or even next day. If you live within driving distance of a Best Buy or similar, you can drive out and pick one up. Bottom line is, you have options. Remember, you can replace a drive with any drive that is as big as, or larger than, the drive it's replacing.
If you need to replace a drive in four years, a 3TB replacement will cost less then than it does today. Or you can get a larger replacement drive than the one you would buy today, for the same price. Statistically, most drives either fail right away or years down the road. There are topics here about burn-in testing drives; read 'em. Beat the hell outta drives when they're new, before you commit any vital info to them. If your drives survive the tests, most likely they will last you several more years.
What factors would play into the new drive I use to the replace the failed one. Does that simply need to be of the same size or larger (I understand I would waste space if larger)? Is there any realistic benefit in replaced with the same make/model as the remaining drives?
Depends on when they fail, and what drive prices are like when it happens. If it's under warranty, RMA it. Maybe buy one to slap in while the RMA drive is in the mail. Your call.
If it happens 3-4 years from now, I'd price larger drives. I bet 6 or 8TB drives are dirt cheap by then. Get on a replacement schedule, maybe three years from now start buying one new drive per month, replacing the old ones one at a time. Once you replace them all, your pool capacity suddenly will increase.
The steps you take to remove a drive from service, add a replacement drive, and resilver that drive are in the FreeNAS instructions.


One thing I read today that was news to me is that a vdev essentially has the performance characteristics of a single drive. Is that correct and, aside from other vdev decisions, is that an argument in favor of more than one vdev, where I have enough drives to accommodate RAIDZ2 on both vdevs?
Depends on the type of operation. IOPS? Random read? Random write? Sequential read? Sequential write?
It's probably irrelevant. 99% of the time, the network is the bottleneck, not the drives.

If the vdev performance issue is accurate, then a nice balance for me might be 8 x 3TB, divided into 2 vdevs of four drives each. That does assume I can get comfortable with using "non-NAS" drives though.
2 vdevs doesn't gain you any redundancy/reliability. If any vdev in a zpool is unreadable, the entire zpool is unreadable.
8x3TB drives in a single RAIDZ2 vdev= if 3 drives fail simultaneously everything is lost
2 vdevs of 4x3TB RAIDZ2 drives each in a zpool = if 3 drives in the same vdev fail simultaneously, everything is lost
You *might* get "lucky" and have 2 drives fail in each of the vdevs, in which case your data is still recoverable. But in either scenario, 3 dead drives can cost you everything.
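Those scenarios can be checked exhaustively. This sketch assumes exactly three drives fail at once, chosen at random, and counts how often each layout survives:

```python
# Exhaustive check of three-simultaneous-failure scenarios:
# 8 drives as one RAIDZ2 vdev vs. two 4-drive RAIDZ2 vdevs.
from itertools import combinations

def pool_survives(failed: tuple, vdevs: list) -> bool:
    """A pool survives if every RAIDZ2 vdev has at most 2 failed drives."""
    return all(sum(1 for d in failed if d in vdev) <= 2 for vdev in vdevs)

drives = range(8)
single = [set(range(8))]                     # 1 x 8-wide RAIDZ2
split = [set(range(4)), set(range(4, 8))]    # 2 x 4-wide RAIDZ2

for label, layout in [("1 x 8-wide RAIDZ2", single), ("2 x 4-wide RAIDZ2", split)]:
    cases = list(combinations(drives, 3))
    ok = sum(pool_survives(c, layout) for c in cases)
    print(f"{label}: survives {ok}/{len(cases)} three-drive failure cases")
```

The single 8-wide vdev survives none of the 56 three-drive failure cases; the split layout survives 48 of them (the 8 cases where all three failures land in the same vdev are fatal). So splitting helps your odds, but either way three dead drives *can* cost you everything.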
If you're really worried about it, how about 8x3TB drives in a RAIDZ3? Then, four drives have to fail before you're screwed. And you still have a max pool capacity of 15TB.

Aside from noise (I assume), what would be the justification in a 7200 rpm drive (over a 5400 rpm drive) when both are rated at 6Gb / sec?
You won't hear the drive spinning. Maybe under initial spin-up when it's just starting. You may hear the actuator (that positions the heads) but that can apply to any speed drive. Just be glad HDD actuators aren't stepper motors anymore.
Ok, 6Gb/sec is the *interface* max throughput. No HDD can saturate it. No spinning HDD can even maintain saturation of a SATA-II/3Gb link; it may initially saturate it while filling or emptying its cache, but it can't maintain it. However, a 7200 RPM drive will get closer. 7200 RPM drives do use more power. Not a lot once they're spun up and working (1-2W more, typically), but quite a bit more during the first few seconds while they're spinning up. If you're a datacenter with 20k drives in service, it's something to consider. For the rest of us, it's negligible. 1-2 watts/drive * 8 drives... doesn't matter.
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
If you pay for your own electricity, TCO says use newer, larger drives and newer, lower-power components. I.e., go newer and don't overbuy.

8 x 1TB = ~4.4TB usable
4 x 4TB = ~6.0TB usable
https://wintelguy.com/zfs-calc.pl

What are your actual space requirements?
What are your actual growth requirements?
What are you actual performance requirements?

WAG would be go with 6x4TB RaidZ2 and be done with it. Ensure you get a good HBA and you should be fine. i.e. something like
https://www.ebay.com/itm/Genuine-LSI-6Gbps-SAS-HBA-LSI-9201-8i-9211-8i-P20-IT-Mode-ZFS-FreeNAS-unRAID/162958581156?hash=item25f1169da4:g:7sYAAOSwjMtcT8T8

DO NOT buy SMR drives.
 

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
OK, @Jessep, let me tell you about my little secret!

My storage needs are laughably small compared to... well, probably pretty much everyone else's here. I use around 400GB right now, all on OneDrive at no extra cost to me (Office 365 subscription). So why the heck am I looking at building a NAS, I hear you say...

One reason is that I have started creating/editing videos for webinars for my small business over the last year or so. These still add up to very little space, but there are a number of trends that will push this up, including increasing their frequency, possibly moving to 4K one day, and more.

But, separate from that, I just need to get more organized for my business and I've decided I want to build a NAS server (I have built a PC in the past - seven years ago - and I am doing so again with Black Friday coming). Once I have all this figured out, up and running, secured, backed up to the cloud and so on, then my NAS server will become more central to all I do.

I have been around long enough to know that "That's more space / speed, etc than I need" is often a blind statement. But, even with that qualifier, my usage patterns suggest that having 4TB of usable storage will do me just fine for the foreseeable future. But that small amount of data will be increasingly important, of course.

So....
  • What are my actual space requirements? Minimal now
  • What are my actual growth requirements? Accelerating, but still small in the big scheme of things
  • What are my actual performance requirements. To be honest, I don't NEED much performance. But I enjoy and value good performance, and don't mind paying a few extra dollars for that (which is why I am going with a 10Gb card) *
* I should add that I also want to build something that will serve me well for years, and the 10Gb premium (cost-wise) is no issue at all for me to take on board now. I am not terribly price-sensitive with this project.

My decisions so far include going with RAIDZ2, which implies at least 4 drives. The 6 x 4TB you suggest seems reasonable, but that will give me a ton more than I need, even for the foreseeable future. So 4 x 4TB might also be viable, given my space needs now and in the future. I want really solid/respected drives, and it does seem that I'd be looking at around $100 per drive (IronWolf, etc.). I haven't decided on this though.

Anyway, it's all good fun. While I have been in IT for decades, I have never played in this specific area before. I always enjoy applying "IT discipline" to home and small business projects, though my experience is almost entirely in terms of software. So I learn new things with every post and every reply here.

Thank you.
 

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
@subhuman, your response is awesome! You have given me so much to think about, in a good way. There are a number of comments in your response that have taken my research in a slightly different direction and the learning continues :)

Thank you!
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
I would suggest you go in the other direction then, leave your data in OneDrive, and back up OneDrive rather than create a NAS that you can backup to OneDrive.

OneDrive defaults to 1TB as I recall (that can be increased) and you can get Veeam O365 backup to back it up locally. $1.42 a month.
https://www.veeam.com/backup-microsoft-office-365.html

Actually, thinking about it: OneDrive would be synced locally to your OneDrive folder. That won't protect against user events (you delete a file), but it will protect against O365 outages and data loss.
 

markwill

Dabbler
Joined
Nov 12, 2019
Messages
35
Yep - I know. I've been thinking about this for a while :)

There are a number of reasons I prefer a NAS (with a backup to the cloud, likely OneDrive initially but that could change). These include independence from any one cloud provider (can switch at any time), the possibility of adding Plex later (low priority for now) and more.

Although a minor factor, it's quite common for me to have OneDrive-based files downloaded to my laptop on demand, since I don't / can't store everything locally. When at home, I'd have everything at hand from the NAS server.

Finally, while I am not going to pretend it's a massive concern for me, I eventually want to reach a point where nothing sensitive is stored on someone else's computer unless it's encrypted both in flight and at rest. I don't have any particular timeline on reaching that point, but it's the direction I'd like to go.

Obviously the approach you suggest is perfectly viable, at least for now, but I'm just going down this FreeNAS path now and enjoying the options it opens up for me (and the family too, for some other scenarios).
 