How to minimize the risk of multiple disk failure


ChrisHolzer

Dabbler
Joined
Apr 6, 2017
Messages
23
Hello again!

I am quickly approaching the point where I will order the hardware for my FreeNAS. :)

I want to go with 10TB NAS drives; however, one thing I read frequently is that using the same model/manufacturer for a RAID increases the risk of multiple drives failing at the same time, especially during the rebuild process, which puts extra stress on the remaining (old) disks.

So I was wondering what you guys think about this. Do any of you use drives from different manufacturers to minimize the risk of having multiple drives fail? If so, what are the downsides of doing that?

Thanks again for the help I already got here in the forums! :)
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
There are very few users who actively buy different drives for this reason, although you are correct that there might be some benefits.
However, following a proper burn-in procedure for the box and drives will ensure that most early drive deaths are caught before you commit data.

The standard reply is: Get quality drives, burn in, you're fine.
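For illustration, a burn-in along these lines might look something like the sketch below (device names like ada0 are placeholders for your actual disks, and badblocks destroys everything on the drive, so only run it before the pool exists):

    smartctl -t long /dev/ada0        # baseline long SMART self-test
    badblocks -b 4096 -ws /dev/ada0   # destructive write/read pass over the whole disk; -b 4096 is needed on large drives
    smartctl -t long /dev/ada0        # second long test after the stress
    smartctl -a /dev/ada0             # compare attributes such as Reallocated_Sector_Ct and Current_Pending_Sector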
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Get quality drives, burn in, enable regular scrubs, run regular SMART tests, keep drives cool, you're fine.
Fixed that for you. And use at least RAIDZ2.
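In FreeNAS the scrubs and SMART tests are scheduled in the GUI, but roughly speaking the equivalent manual commands look like this (the pool name "tank" and device ada0 are just examples):

    zpool scrub tank                  # start a scrub of the pool
    zpool status -v tank              # check scrub progress and any errors it found
    smartctl -t short /dev/ada0       # short self-test; schedule long tests less frequently
    smartctl -l selftest /dev/ada0    # review the self-test log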
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
I always recommend spending less time worrying about disk failures, and more time developing your backup plan. As motivation, I recommend RAIDZ1. RAIDZ2 is for a system you may not be able to service for a month. RAIDZ3 is for a system you may not service for a year. (These are examples, not hard numbers.)

The fear of multiple disk failure is a hardware RAID thing. If you have scheduled scrubs and SMART tests, you're covered. A scrub is just as much work as a resilver. Hardware RAID would fail during a rebuild because the array was never scrubbed, so additional problems would be discovered during the rebuild.

As we have seen recently, your data can be lost in a second by screwed-up encryption or by accidentally deleting a dataset. Multiple disk failures should be the least of your worries. Make sure you are notified as soon as a problem is detected.
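FreeNAS can email you alerts itself; as an illustration of the same idea outside FreeNAS, a smartd.conf entry (the address is a placeholder) would be:

    DEVICESCAN -a -m admin@example.com   # monitor all disks, email when SMART reports a problem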
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
I do not purchase drives from various manufacturers. The advice already stated above is sound. Backups and regular (automated) maintenance are as good a security measure as any.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Thanks for the helpful feedback guys!

What do you think about the http://www.seagate.com/gb/en/internal-hard-drives/hdd/ironwolf/ (8TB Model) for a home NAS?
I'm intrigued by the Ironwolf series for sure, particularly since they are currently a lot cheaper than WD Reds in my area.
The potential downside, IMO, is the different formatting and reporting of SMART status. Seagate uses a different pattern than the WD drives I'm used to, and I find the Seagate SMART data more difficult to read. It might be a minor detail accentuated by habit, but nonetheless one that influences my decision.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@ChrisHolzer
I personally ended up using 2 x 4TB WD Reds and 2 x 4TB WD Red Pros for my 4-disk RAID-Z2 pool, with each disk bought separately (like one Red in retail packaging, one Red bought in bulk packaging).

If I were to do it over (even though none of the 4 disks has failed or given any indication of failing any time soon), I would have bought 2 from a different manufacturer.

Please note that in my case, I was looking for 5 years of good, solid, reliable use from my NAS. My prior NAS lasted 7 years with only a memory upgrade and 2 additional disks.

Last, one thing can help during disk replacement: if you plan on having an extra SATA/SAS disk slot (internal or external), you can perform a ZFS disk replacement while the failing disk is still present. Basically, ZFS creates a mirror of the failing disk; for any bad blocks encountered, ZFS uses whatever redundancy is available (parity or mirror). When done, ZFS detaches the failing disk from the vdev. That means less impact on the overall pool, only on the failing disk, which we obviously care less about.
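As a sketch (pool and device names are placeholders, with ada3 the failing disk and ada7 the new disk in the spare slot), the whole operation is one command:

    zpool replace tank ada3 ada7      # mirrors the failing disk onto the new one
    zpool status tank                 # shows a temporary 'replacing' vdev until the resilver completes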

Plus, the free disk slot can be used for backups, especially if it's hot swap, or cold swap in a tray.
...
The potential downside, IMO, is the different formatting
...
Did you mean the Seagate 8TB SMR Archive disks in the above comment?

As far as I know, the Seagate Ironwolf drives do not use SMR technology.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
If you plan on having an extra SATA/SAS disk slot (internal or external), you can perform a ZFS disk replacement while the failing disk is still present.
It only took nine months to get this into the docs, but it's there (kind of). But see Chris Moore (not Kris Moore)'s comment on the bug, noting that the resilver seems to take much longer this way.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
It only took nine months to get this into the docs, but it's there (kind of). But see Chris Moore (not Kris Moore)'s comment on the bug, noting that the resilver seems to take much longer this way.
Yes, performance using "replace with the existing disk still present" can be slower (because the source is a single disk instead of the rest of the vdev's disks).

It all depends on what you want to achieve. For example, "replace with the existing disk still present" can maintain additional redundancy for the data, except when the failing disk keeps finding new errors and is taking too long. Then it's time to just get it done and ignore the failing disk.
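At that point, something like the following (names again placeholders) takes the failing disk out of the picture and lets the resilver finish from the vdev's remaining redundancy:

    zpool offline tank ada3           # stop using the failing disk; the resilver continues from parity/mirror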
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Yes, performance using "replace with the existing disk still present" can be slower (because the source is a single disk
I don't think this (the claim that the source is a single disk) is correct. I've done this type of disk replacement a few times, and it looked like the system was hitting the whole vdev quite a bit. It's entirely possible, of course, that there was other stuff going on, but even though the outcome is a replacement disk that's identical to the replaced disk (as of the moment the latter is taken offline), I don't think that's how the process works internally. I'm speaking from a user's perspective, of course, having never looked at the code (and it's unlikely I'd be capable of understanding it anyway).
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Did you mean the Seagate 8TB SMR Archive disks in the above comment?

As far as I know, the Seagate Ironwolf do not use SMR technology.
Correct, the Ironwolf drives do not use SMR.
I was referring to the S.M.A.R.T. raw value output formatting.
 