Query about SMART test results on 20TB Seagate drives

Joined
Jan 25, 2022
Messages
3

Background

I am setting up a new NAS that will run TrueNAS. (The RAM is ECC, in case that is useful.)
It has the following drive setup. All drives were bought brand new and are still under warranty.
I am trying to make sure the drives are healthy before I start using them.

OS Drive:
1x M.2 NVMe SSD (for the OS)

ZPool Drives (RAIDZ2):
4x Seagate IronWolf NAS drives - Model Number: ST20000NE000 20TB each
2x bought from vendor A
1x bought from vendor B
1x bought from vendor C

I am following the excellent guide by Uncle Fester, updated and hosted on danb35's wiki:
https://www.familybrown.org/dokuwiki/doku.php?id=fester112:hvalid_hdd
Following the instructions in the 'Hardware Validation -> HDD/SSD Validation' section, this is what I have done so far.

Tests I have performed

Here are the tests I ran, in order, on each of the four Seagate 20TB drives.
The SMART long test took around 24-26 hours, for anyone interested.

1) Ran smartctl:
Code:
smartctl -t short /dev/ada{0..3}
smartctl -t conveyance /dev/ada{0..3}
smartctl -t long /dev/ada{0..3}


2) Ran badblocks (took well over 150 hours; quite the coffee break!):
I then ran the following badblocks command on each of the 4 drives:
Code:
badblocks -b 32768 -c 512 -wsv /dev/ada{0..3}


3) Re-ran smartctl after badblocks completed without any errors:
Code:
smartctl -t short /dev/ada{0..3}
smartctl -t conveyance /dev/ada{0..3}
smartctl -t long /dev/ada{0..3}

Test Results:

badblocks test results:

All four drives successfully completed the 4 iterations of write/read without a single error.

SMART test results - The Good:

To start with the good news, each of the four drives has a good-looking self-test log:

Code:
SMART overall-health self-assessment test result: PASSED

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       269         -
# 2  Conveyance offline  Completed without error       00%       243         -
# 3  Short offline       Completed without error       00%       243         -
# 4  Extended offline    Completed without error       00%        28         -
# 5  Conveyance offline  Completed without error       00%         2         -
# 6  Short offline       Completed without error       00%         2         -



SMART test results - The Bad:

According to the guide, the following attributes should all have zero values:

Code:
Raw_Read_Error_Rate
Reallocated_Sector_Ct
Seek_Error_Rate
Spin_Retry_Count
Calibration_Retry_Count
Reallocated_Event_Count
Current_Pending_Sector
Offline_Uncorrectable
UDMA_CRC_Error_Count


As a side note, I noticed that the following 2 attributes are not present in any of the SMART results for my drives:
Code:
Calibration_Retry_Count = not present
Reallocated_Event_Count = not present


However, in my results the following attributes have non-zero values in the 'RAW_VALUE' column.
I am only including the attributes from the list above that were non-zero (one row per drive):

Code:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
----
1 Raw_Read_Error_Rate     0x000f   081   064   044    Pre-fail  Always       -       124361665
1 Raw_Read_Error_Rate     0x000f   083   064   044    Pre-fail  Always       -       217428929
1 Raw_Read_Error_Rate     0x000f   077   064   044    Pre-fail  Always       -       46461889
1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       141671361
----
7 Seek_Error_Rate         0x000f   080   060   045    Pre-fail  Always       -       106182699
7 Seek_Error_Rate         0x000f   080   060   045    Pre-fail  Always       -       106791513
7 Seek_Error_Rate         0x000f   080   060   045    Pre-fail  Always       -       106481643
7 Seek_Error_Rate         0x000f   080   060   045    Pre-fail  Always       -       106904772
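
For anyone repeating this check, the attributes from the guide's list can be filtered out of the `smartctl -A` attribute table with a few lines of shell. This is only a sketch: the function name `watched_attrs` is made up for illustration, and it assumes the ten-column layout shown above, with the raw value in the tenth column.

```shell
#!/bin/sh
# Print "NAME RAW_VALUE" for the attributes the guide says to watch.
# Reads a smartctl -A style attribute table on stdin.
# Assumption: the raw value is the 10th whitespace-separated column.
watched_attrs() {
    awk '
        $2 ~ /^(Raw_Read_Error_Rate|Reallocated_Sector_Ct|Seek_Error_Rate|Spin_Retry_Count|Calibration_Retry_Count|Reallocated_Event_Count|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ {
            print $2, $10
        }
    '
}

# Example (needs root and a real device; the device name is an assumption):
#   smartctl -A /dev/ada0 | watched_attrs
```

Attributes the drive does not report simply produce no output line, which matches the two missing attributes noted above.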



My Question

As shown above, I am getting non-zero values for some attributes, whereas the guide says they should all be '0'.
The guide also says:
"If you get any value other than zero in the “RAW VALUE” for these entries you should be suspicious of this drive and may need to return the device for testing depending on the manufacturer’s warranty"

Since badblocks ran successfully and I sourced the 4 drives from 3 different vendors, it seems unlikely that they all have similar faults. Maybe the guide doesn't account for a change in how hard drives report SMART results, or in the technology? I have no knowledge of how hard drives actually work, so this is just an optimistic thought that lets me believe the hard drives are fine - HAHaha... (crying).

I would really appreciate any input on these results, and on whether I should return the drives or can trust them and start using them.

Thank you TrueNAS community!
 
Joined
Oct 22, 2019
Messages
3,641
Those are Seagates. You're fine. They have their own RAW values, and each manufacturer is cryptic about it anyways.

If badblocks succeeds without errors, and the SMART summary reports no errors, and the self-tests all passed, it's safe to say your drives are fine. Plus, you've got ZFS on top of all of that with scrubs, checksums, and self-healing.

Set up regular self-tests, keep an eye on your alerts, and occasionally check your drives' SMART summaries.

I, personally, have a weekly task that runs short self-tests on my spinning HDDs, and I will occasionally run a long self-test manually (rarely.)
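
A scheduled short test along these lines could be scripted roughly as follows. This is only a sketch of such a task, not the exact one described above: the function name `start_short_tests` and the device names are assumptions for illustration, and TrueNAS can also schedule S.M.A.R.T. tests from its web UI.

```shell
#!/bin/sh
# Start a short SMART self-test on each device passed as an argument.
# The drive runs the test in the background; results show up later in
# the self-test log (smartctl -l selftest /dev/adaX).
start_short_tests() {
    for dev in "$@"; do
        smartctl -t short "$dev"
    done
}

# Example weekly invocation (device names are assumptions; adjust to your system):
#   start_short_tests /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3
```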
 
Joined
Oct 22, 2019
Messages
3,641
I can relax now and start using the drives.:smile:
It's always reassuring when yo-, wait. Wait! I just realized something from your initial post.

One of the drives reported a RAW value of 124361665!

124361665 is Seagate's internal code for "IMMINENT DRIVE FAILURE". At first I thought it read 124361662, which means "Everything is fine".

Oh nnnnnnnnnnnnooooooooooooooooooo!!!11 :eek: :eek: :eek:
 
Joined
Jan 25, 2022
Messages
3
I've sent all 4 drives back to the manufacturers this morning.
The postage cost a lot, but it will be worth it when I get replacements which should surely return "124361662" values.

Thank you winnie!!
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
Great information - where did you find that reference? Would love to know more as I am currently testing 4 20TB Seagate drives in my system.
 

souporman

Explorer
Joined
Feb 3, 2015
Messages
57
He was being sarcastic, and so was OP about sending the disks back. Ignore the last two posts and the information in here is good.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925

souporman

Explorer
Joined
Feb 3, 2015
Messages
57
This is a really old post, I doubt he cares. Also it's just an HTTP site, and Malwarebytes doesn't like those. If you run the URL through VirusTotal it comes back clean... but again, this is a very old post and I doubt anybody cares.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112

Seagate's decoder ring is that the RAW values are 48-bit integers: the upper 16 bits hold the error count, and the lower 32 bits hold the number of operations.

So unless your RAW value is 2^32 (4,294,967,296) or greater, you have 0 errors, and the RAW value is simply the number of operations the disk has served.
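
As a quick check, that split can be applied to the raw values posted earlier in the thread. The sketch below assumes the 16/32-bit layout is accurate for these drives; the function name `decode_seagate_raw` is made up for illustration.

```shell
#!/bin/sh
# Split a Seagate-style 48-bit raw value into "ERRORS OPERATIONS".
# Assumption: upper 16 bits = error count, lower 32 bits = operation count.
# Needs a shell with 64-bit integer arithmetic.
decode_seagate_raw() {
    raw=$1
    errors=$(( raw >> 32 ))
    operations=$(( raw & 4294967295 ))   # 4294967295 = 2^32 - 1
    echo "$errors $operations"
}

# Raw_Read_Error_Rate values from the first post
for raw in 124361665 217428929 46461889 141671361; do
    decode_seagate_raw "$raw"
done
```

All four values are below 2^32, so each decodes to an error count of 0 with the operation count equal to the raw value. The same holds for the four Seek_Error_Rate values.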
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
Bwahahahahahahahah. I was tired last night. I actually thought about it as I was brushing my teeth this morning and finally realized I am dumb.

Well played... well played.
 