SMART results - are these drives bad?

Status
Not open for further replies.

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I bought six "white label" 6 TB drives from eBay a few weeks ago--at $160 each, I couldn't pass them up. Now I'm starting to think I should have. Two of them started showing hundreds of bad sectors (and one of those dropped offline completely) within the first 24 hours of testing, so I was able to RMA them with the vendor (they did pay return shipping, which was nice, but didn't do an advance exchange). While waiting for those replacements, two more drives started showing questionable results. Here's one:

Code:
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       5775
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       7
  5 Reallocated_Sector_Ct   0x0033   198   198   140    Pre-fail  Always       -       71
  7 Seek_Error_Rate         0x002f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       508
 10 Spin_Retry_Count        0x0033   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       7
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   054   000    Old_age   Always       -       40
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   112   106   000    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x0036   200   200   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   196   196   000    Old_age   Always       -       4
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  484  -
# 2  Extended offline  Completed without error  00%  310  -
# 3  Extended offline  Completed without error  00%  187  -
# 4  Short offline  Completed without error  00%  0  -


...and here's the other:
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       5833
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       7
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       61
  7 Seek_Error_Rate         0x002f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       508
 10 Spin_Retry_Count        0x0033   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       7
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   055   000    Old_age   Always       -       39
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   113   107   000    Old_age   Always       -       39
195 Hardware_ECC_Recovered  0x0036   200   200   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   196   196   000    Old_age   Always       -       4
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  484  -
# 2  Extended offline  Completed without error  00%  312  -
# 3  Extended offline  Completed without error  00%  188  -
# 4  Short offline  Completed without error  00%  0  -


Both have dozens of reallocated sectors, and I'm thinking they should be replaced. Am I being too picky?
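
For anyone who wants to pull the handful of media-health counters out of a dump like the ones above, here is a minimal Python sketch. It assumes the attribute rows look like the `smartctl` rows pasted in this post; the `WATCHED` set, the function names, and the "margin" figure (normalized VALUE minus THRESH) are choices made for this illustration, not anything smartctl itself reports.

```python
import re

# Sample rows copied from the first drive's output in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  198  198  140  Pre-fail  Always  -  71
196 Reallocated_Event_Count 0x0032  196  196  000  Old_age  Always  -  4
197 Current_Pending_Sector  0x0032  200  200  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0030  100  253  000  Old_age  Offline  -  0
"""

# Attribute IDs to report (an assumption for this sketch, not an official list).
WATCHED = {5, 196, 197, 198}

# ID, name, flag, value, worst, thresh, type, updated, when_failed, leading digits of raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_id: fields} for every attribute row found in the text."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # headers, blank lines, and anything else are skipped
        attrs[int(m.group(1))] = {
            "name": m.group(2),
            "value": int(m.group(4)),
            "worst": int(m.group(5)),
            "thresh": int(m.group(6)),
            "raw": int(m.group(10)),
        }
    return attrs


def summarize(attrs):
    """Build one summary line per watched attribute that is present."""
    lines = []
    for attr_id in sorted(i for i in attrs if i in WATCHED):
        a = attrs[attr_id]
        margin = a["value"] - a["thresh"]  # distance of normalized value from threshold
        lines.append(
            f"{attr_id:>3} {a['name']}: raw={a['raw']} "
            f"value={a['value']} thresh={a['thresh']} margin={margin}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(summarize(parse_attributes(SAMPLE))))
```

The point of printing the margin is that a raw count of 71 still leaves the normalized value (198) well above the vendor threshold (140), which is why the drive does not report itself as failing even though the raw count is not zero.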
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Drives are a little hot but they are passing extended tests so they seem OK to me. I would watch them to see if those values keep going up. If they do then you have a problem.
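
A rough sketch of what "watch them to see if those values keep going up" could look like in Python: keep the raw counters from an earlier reading and compare them with a later one. The function name, the default list of attribute IDs, and the `later` reading are all invented for illustration; only the baseline numbers come from the first drive in this thread.

```python
def growing_counters(before, after, watched=(5, 196, 197, 198)):
    """Return {attr_id: (old_raw, new_raw)} for watched counters that increased."""
    changed = {}
    for attr_id in watched:
        if attr_id in before and attr_id in after:
            if after[attr_id] > before[attr_id]:
                changed[attr_id] = (before[attr_id], after[attr_id])
    return changed


# Raw values from the first drive posted in this thread.
baseline = {5: 71, 196: 4, 197: 0, 198: 0}

# Hypothetical later reading, made up purely to show the comparison.
later = {5: 79, 196: 5, 197: 2, 198: 0}

if __name__ == "__main__":
    for attr_id, (old, new) in growing_counters(baseline, later).items():
        print(f"attribute {attr_id}: {old} -> {new}")
```

A stable reading returns an empty dict, so a scheduled job only needs to send mail when the result is non-empty.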
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Yeah, I'm having trouble keeping the temps down in the new chassis without turning up the fans to jet engine levels. I think I ultimately need to get some HVAC into the server closet; for the time being I have a fairly large household fan blowing at the front of the server.

I'd honestly just as soon keep these drives and add them to my pool, rather than wait a couple of weeks for their replacements to arrive and get burned in, but I don't want to get stuck with replacing two drives in short order.
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
I'm sad to hear this. I bought 8 of those from Amazon and they are working perfectly so far. How well were your drives packaged? Are you running badblocks on them?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The six drives came in a box designed for shipping lots of hard drives. The two replacements came wrapped in quite a bit of bubble wrap. Yes, I've run badblocks on them, and it's in the course of doing that that the reallocated sectors appeared.
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
Huh, interesting. Mine came in a big box for 24 drives as well. I hope you just got a bad batch and that these aren't just subpar drives. I want to recommend these to people.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Quoting danb35: "I've run badblocks on them, and it's in the course of doing that that the reallocated sectors appeared."
Quoting the SMART output: 196 Reallocated_Event_Count  0x0032  196  196  000  Old_age  Always  -  4
So, badblocks does 4 write passes, and sectors only get reallocated by a failed write. The circumstantial evidence suggests that each write pass led to sectors being reallocated. I regard that as more troubling than a single reallocation event.
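
To make the "4 write passes" concrete: in write mode (`-w`), the badblocks man page lists the patterns 0xaa, 0x55, 0xff and 0x00, each written across the whole device and then read back. The Python sketch below imitates that loop against an ordinary scratch file. It is an illustration of the idea only, not how badblocks is implemented, and because it goes through the filesystem cache it does not actually exercise the disk surface. The function name and block size are assumptions.

```python
import os
import tempfile

# Patterns listed in the badblocks man page for write-mode (-w) testing.
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)


def write_verify_passes(path, size, block_size=4096):
    """For each pattern: fill the file, read it back, collect mismatching block indexes."""
    results = []
    for pattern in PATTERNS:
        block = bytes([pattern]) * block_size

        # Write pass: cover the whole file with the current pattern.
        with open(path, "wb") as f:
            remaining = size
            while remaining > 0:
                n = min(block_size, remaining)
                f.write(block[:n])
                remaining -= n

        # Read pass: compare every block against what was written.
        bad = []
        with open(path, "rb") as f:
            index = 0
            while True:
                data = f.read(block_size)
                if not data:
                    break
                if data != block[:len(data)]:
                    bad.append(index)
                index += 1
        results.append((pattern, bad))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "scratch.img")
        for pattern, bad in write_verify_passes(target, size=1 << 20):
            print(f"pattern 0x{pattern:02x}: {len(bad)} mismatching blocks")
```

Four patterns means four separate full-surface writes, which is why four reallocation events lining up with one badblocks run is suggestive.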
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
That's a good point. Seems one way to test that hypothesis would be to run badblocks again and see if there's any change. Starting that now, should be done in about 4 days.
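
As a sanity check on the "about 4 days" figure, here is a back-of-the-envelope estimate in Python. It assumes a write-mode badblocks run makes eight full sweeps of the disk (four patterns, each written and then read back) and that the drive sustains a steady average throughput. The throughput numbers are guesses, not measurements from these drives; the capacity is the one smartctl reports for this model.

```python
# User Capacity reported by smartctl for this drive model.
CAPACITY_BYTES = 6_001_424_400_384


def badblocks_hours(capacity_bytes, mb_per_s, patterns=4):
    """Estimated hours for a write-mode run: one full write plus one full read per pattern."""
    full_sweeps = patterns * 2
    seconds = capacity_bytes * full_sweeps / (mb_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    for rate in (120, 150, 180):  # assumed average MB/s, not measured values
        hours = badblocks_hours(CAPACITY_BYTES, rate)
        print(f"{rate} MB/s -> {hours:.0f} h ({hours / 24:.1f} days)")
```

At an assumed 150 MB/s average this works out to roughly 89 hours, a bit under four days, so the estimate is in the right range.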
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Quoting danb35: "30 hours into the test, no change in the numbers."
Kinda sounds like they were diverted from the usual refurb flow and did not have their bad sectors remapped.

What do they report themselves to be?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Quoting Ericloewe: "What do they report themselves to be?"
Code:
[root@freenas2] ~# smartctl -a /dev/da13
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WL6000GSA6457
Serial Number:    WOL240336066
LU WWN Device Id: 0 000000 000000000
Firmware Version: 01.00F.3
User Capacity:    6,001,424,400,384 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS (minor revision not indicated)
Local Time is:    Sat May 21 06:58:18 2016 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


65 hours into badblocks, and still no new errors.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
OK, badblocks finished all four passes last night, with no change to the reallocated sectors or events (edit: and no errors reported by badblocks either). Nothing showing offline/uncorrectable or pending. I kicked off long SMART tests on all six of the drives last night, but they hadn't finished by the time I left for work this morning.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
That seems consistent with the hypothesis I mentioned above. I guess that makes them usable, with caution.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I went ahead and added them to the pool, with the usual SMART monitoring and testing (short daily, long weekly, scrub every two weeks). But at the first sign of (further) trouble, I think I'll be ordering a WD Red as a replacement.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Well, the two suspect drives are now failing SMART tests. Actually, they have been for at least the last three weeks, but the system hasn't been doing anything to notify me about that. I guess I should have listened to my gut.

For those keeping track at home, that's four out of the six drives that I bought that have failed within a few months.
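
Since the failed tests went unnoticed for weeks, here is a small Python sketch of an independent check: scan the self-test log and flag every entry that is anything other than a clean completion. It assumes the log columns are separated by two or more spaces, as in the logs posted earlier in this thread. The function names are made up, and the failing entry in the sample is hypothetical, written in the style smartctl uses for read failures rather than copied from these drives.

```python
import re

SAMPLE_LOG = """\
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed: read failure  90%  1120  123456789
# 2  Extended offline  Completed without error  00%  484  -
# 3  Short offline  Completed without error  00%  0  -
"""  # entry "# 1" is a hypothetical failure added for illustration


def parse_self_test_log(text):
    """Split each '# N' log line into its six columns."""
    entries = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 6 or not fields[0].startswith("#"):
            continue  # header row or unrelated line
        _num, description, status, remaining, lifetime, lba = fields
        entries.append({
            "description": description,
            "status": status,
            "remaining": remaining,
            "lifetime_hours": int(lifetime),
            "lba_of_first_error": lba,
        })
    return entries


def needs_attention(entries):
    """Return entries whose status is not a clean completion."""
    return [e for e in entries if e["status"] != "Completed without error"]


if __name__ == "__main__":
    for e in needs_attention(parse_self_test_log(SAMPLE_LOG)):
        print(f"{e['description']} at {e['lifetime_hours']} h: {e['status']} "
              f"(LBA {e['lba_of_first_error']})")
```

Treating every status other than "Completed without error" as worth a look is deliberately noisy: it also catches aborted or still-running tests, which seems a better default than silence.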
 
