Help - New Build, New HDD, SMART Errors

Status
Not open for further replies.

MichaelGMorgan

Dabbler
Joined
Jun 7, 2017
Messages
13
Hi,

Quick backstory - I built a FreeNAS system using a cheap RAID card with 4 SATA ports along with 4 motherboard SATA ports with the end result being an 8 drive system, with one for the OS (120GB SSD). All 7 HDDs were Toshiba P300 3TB.

Worked solid for 6 months then sudden multiple HDD failures. All sorts of SMART errors - lots of issues.
I sent back a bunch of drives and bought a proper HBA, an LSI M1015. I flashed it to IT mode and have just rebuilt my system.

I now have 3 of my original HDDs, none of which were showing any SMART errors plus a brand new Toshiba P300 3TB drive straight out of the packaging.
I've installed FreeNAS onto a USB and haven't created any volumes yet.

I've done short SMART tests on all 4 drives, all of which are connected via my LSI HBA card.
3 drives are showing SMART errors - specifically "seek error rate" and "spin retry count", both of which are in the tens of thousands.
The other drive is showing a SMART error which I know is important - it's got a value of 7 for "reallocated sector count".

Out of the 4 drives above, my brand new drive is one of the ones with a very high "spin retry count". The drive has been in the system for around 20 minutes or so.
I've done multiple short tests - haven't run any long tests yet.

What should I do? If these SMART errors are correct then in total it means 8 HDDs have failed on me in the past 8 months. They were delivered across three deliveries so unlikely they were all dropped etc.

I don't know what to do. I could send them all back and either get replacements or buy an alternative drive (a NAS-specific drive, maybe), but I don't want to end up with multiple failing drives again if the cause is something else.

Any advice would be much appreciated!

Thanks
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Toshiba hard drives are relatively new and are more-or-less 'Unknown Territory' when it comes to judging their reliability. Your own experience with them doesn't bode well.

The drive with re-allocated sectors should be eligible for replacement under warranty.

In any case, I recommend exchanging all of these drives for HGST Deskstar NAS or Western Digital Red models, both of which are known to work well in FreeNAS systems.

Whatever drives you end up using, burn them in thoroughly (see my "Github repository for FreeNAS scripts, including disk burnin" thread in the Resources section) before you put them into service.

Good luck!
 

MichaelGMorgan

Dabbler
Joined
Jun 7, 2017
Messages
13
I've had a look around and can see that Seagate IronWolf 4TB drives are a bit cheaper than the WD 4TB Red drives.
Do they have much use/reliability history with FreeNAS?

I'd prefer to not have to replace all my drives due to cost but if I have to then so be it. I can send them all back, get a refund and buy the replacement drives etc.

I've just started a long SMART test on all my drives so I should hopefully know more tomorrow - it's going to take around 360 minutes apparently.
I'm about 10 minutes in and I can already hear a clicking noise every 30 seconds or so, so I'm guessing one or more of the drives is having some real difficulty.

In the meantime, if anyone has any input or thoughts on what could be causing so many failures I'd appreciate it.
I assume there shouldn't be any issues with my LSI M1015 incorrectly reporting SMART error values?

Thanks
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
I've had a look around and can see that Seagate IronWolf 4TB drives are a bit cheaper than the WD 4TB Red drives.
Do they have much use/reliability history with FreeNAS?
You can search the forum and find other members' opinions on Seagate and other manufacturers' drives. I do not recommend Seagate drives, but that's just my personal opinion; others swear by them.
I'd prefer to not have to replace all my drives due to cost but if I have to then so be it. I can send them all back, get a refund and buy the replacement drives etc.

I've just started a long SMART test on all my drives so I should hopefully know more tomorrow - it's going to take around 360 minutes apparently.
I'm about 10 minutes in and I can already hear a clicking noise every 30 seconds or so, so I'm guessing one or more of the drives is having some real difficulty.
Clicking noise? That's not good... record the serial number of the drive that's clicking so you can easily identify it later. That alone may be grounds enough for a warranty replacement.
In the meantime, if anyone has any input or thoughts on what could be causing so many failures I'd appreciate it.
I assume there shouldn't be any issues with my LSI M1015 incorrectly reporting SMART error values?

Thanks
You might want to double-check your cable connections, and make sure that you're using quality cables.
 

NZ_JJ

Dabbler
Joined
May 25, 2017
Messages
28
I've had a look around and can see that Seagate IronWolf 4TB drives are a bit cheaper than the WD 4TB Red drives.
Do they have much use/reliability history with FreeNAS?

IronWolf vs WD Reds vs HGST DeskStar....

I am a WD fan. For several years I've had very good reliability out of various disks of theirs. On the other hand, I only switched to them because of a bad lot of Seagates I was using.
Up until then I had been Seagate all the way.

I believe that Seagate have improved their reliability markedly in recent times; I've just not had the need / incentive to switch again.
That said, I just purchased a 4TB IronWolf yesterday, as my local retailer was out of stock on the Reds.
Burn-in underway, all good so far.

Overall, nowadays, I believe it's a matter of personal preference.
Make sure you burn in fully, including badblocks, and set up regular SMART testing.
One thing to note is the RPM: the HGSTs run at 7200, the IronWolfs at 5900 or 7200, and the Reds at 5400.
Higher RPM = slightly higher operating temp.


In the meantime, if anyone has any input or thoughts on what could be causing so many failures I'd appreciate it.
I assume there shouldn't be any issues with my LSI M1015 incorrectly reporting SMART error values?

The M1015 is one of the forum staples. Once flashed correctly, it should give you years of trouble-free use.
Could be power related (surge/brownout). What PSU are you using? Do you have a UPS? If the drives received over- or under-voltage, it could explain damage to multiple drives.
Also, what use does the NAS get: powered on 24/7, or switched off and on often?

Were the failures isolated to the motherboard or the SATA card? If all the drives on one controller went, the chipset could have failed.
 

MichaelGMorgan

Dabbler
Joined
Jun 7, 2017
Messages
13
I'll see what the long SMART test shows and then maybe buy a few IronWolf drives. The problem is that I actually need 8 of them so it's quite costly.

The HDDs originally failed across both the RAID card and the motherboard, starting with the 4 on the RAID card and then I started having problems with the mobo drives soon after.
It was connected to an APC UPS so no issues there. It was a 600W PSU - certainly not high end but not the cheapest option either.

When you say burn-in - what exactly is that doing to the drives? I'm guessing it's a drawn out stress test to make sure they won't fail?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
what exactly is that doing to the drives? I'm guessing it's a drawn out stress test to make sure they won't fail?
I wouldn't say it's so much a "stress test" as a thorough test. The normal burn-in procedure consists of a long SMART self-test followed by a run of badblocks; the latter runs (by default) four passes of writing a certain pattern to every block on the disk, then reading every block to confirm that it's correct. The procedure is written up in several places, including @UncleFester's guide (link in my sig).
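To make the write-then-verify idea concrete, here is a minimal Python sketch of that kind of pattern test. It is not badblocks itself and is not taken from any of the linked guides; the function name `pattern_test` is made up for illustration. The four patterns are the ones badblocks uses in its destructive write mode. The example runs against an in-memory buffer standing in for a disk, so it is safe to try.

```python
import io

# Patterns written in order by badblocks' destructive write mode (-w).
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)


def pattern_test(dev, size, block_size=4096):
    """Write each pattern to every block, read it back, return bad block indices.

    `dev` is any seekable binary file object. Hypothetical helper for
    illustration only; it overwrites everything in `dev`.
    """
    bad = set()
    n_blocks = size // block_size
    for pattern in PATTERNS:
        expected = bytes([pattern]) * block_size
        # Write pass: fill every block with the current pattern.
        dev.seek(0)
        for _ in range(n_blocks):
            dev.write(expected)
        dev.flush()
        # Read pass: confirm every block still holds the pattern.
        dev.seek(0)
        for index in range(n_blocks):
            if dev.read(block_size) != expected:
                bad.add(index)
    return sorted(bad)


if __name__ == "__main__":
    # An in-memory buffer stands in for a real disk here.
    fake_disk = io.BytesIO(bytes(64 * 4096))
    print(pattern_test(fake_disk, size=64 * 4096))
```

On a real drive the same four passes touch every block, which is why a full burn-in of a multi-terabyte disk takes days rather than hours.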
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
If these SMART errors are correct then in total it means 8 HDDs have failed on me in the past 8 months.
It's possible, but it seems more likely that something else is wrong with the system. Time to post full system specs and some smartctl output.
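When reading smartctl output, the normalized VALUE and THRESH columns usually say more than the raw number. As a rough sketch, the Python below parses the attribute table that `smartctl -A` prints and lists attributes whose normalized value has dropped to or below its threshold. The sample table is illustrative only: the raw values echo numbers mentioned in this thread, but the normalized values and thresholds are invented, and `parse_attributes` and `failing` are hypothetical helper names.

```python
# Illustrative sample in the layout printed by `smartctl -A`.
# VALUE/WORST/THRESH here are made up for the example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       7
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
 10 Spin_Retry_Count        0x0033   100   100   030    Pre-fail  Always       -       196608
"""


def parse_attributes(text):
    """Parse smartctl's attribute table into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            rows.append({
                "id": int(parts[0]),
                "name": parts[1],
                "value": int(parts[3]),
                "worst": int(parts[4]),
                "thresh": int(parts[5]),
                "when_failed": parts[8],
                "raw": " ".join(parts[9:]),
            })
    return rows


def failing(rows):
    """Names of attributes whose normalized value is at or below threshold."""
    # A threshold of 0 means the attribute never counts as failed.
    return [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]


if __name__ == "__main__":
    rows = parse_attributes(SAMPLE)
    for r in rows:
        print(r["id"], r["name"], r["value"], r["thresh"], r["raw"])
    print("failing:", failing(rows))
```

In this made-up sample nothing is flagged, even though one raw value is in the hundreds of thousands. Real output pasted into the thread would show whether the drives themselves consider those attributes failed.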
 

MichaelGMorgan

Dabbler
Joined
Jun 7, 2017
Messages
13
Okay, so the long SMART test has completed. I took note of both the "seek error rate" and "spin retry count" values before and after the long test.
All drives came back with 0 for reallocated sectors.

All drives now show 0 for "seek error rate". Two of the drives still have very high "spin retry count" values, one at 196608 and the other at 65537 - these are exactly the same as prior to the long test.

I think I'll take the plunge and order a bunch of IronWolf drives and swap out the ones with SMART errors. I'll keep a few on hand and continue to swap them out as soon as a SMART error is discovered.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Two of the drives still have very high "spin retry count" values, one at 196608 and the other at 65537 - these are exactly the same as prior to the long test.
Those don't necessarily indicate problems with the drives. There's very little standardization across brands for most SMART attributes.
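One hint that these raw values may not be plain counts: both sit right next to a multiple of 65536. Some drives pack several small counters into a single 48-bit raw value. Whether these Toshiba drives do so for this attribute is an assumption, not something confirmed in this thread, but a short Python sketch shows what the two numbers look like when split into 16-bit words (`split_words` is a hypothetical helper name):

```python
def split_words(raw):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)


if __name__ == "__main__":
    # The two raw values reported in this thread.
    for raw in (196608, 65537):
        print(raw, hex(raw), split_words(raw))
    # 196608 0x30000 (0, 3, 0)
    # 65537 0x10001 (0, 1, 1)
```

Read that way, the values would be a handful of single-digit counters rather than tens of thousands of retries, which fits the point that raw numbers mean little without the vendor's definition.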
 

NASnewbi916

Dabbler
Joined
Apr 6, 2017
Messages
31
I've personally experienced issues with RAID cards, so I decided to buy only non-RAID SATA cards.


 