
Checking for TLER, ERC, etc. support on a drive

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
11,784
Thanks
3,041
#1
One of the problems with consumer-grade hard drives is that most of them will hang in the event that they run into an error, and will internally retry the operation, possibly for a minute or more. For a desktop PC, where redundancy does not exist, this is the correct course of action, because failure of a sector means loss of the data.

Enterprise class drives typically support the ability to limit the amount of time a drive wastes trying to recover data. Most of these drives are used in RAID arrays, and so in the event of a failure, the data can be recovered from parity. A drive encountering read errors cannot be allowed to hang for large amounts of time, because this stalls whatever the server is trying to do. So manufacturers include features to control the retries of failures.

For Western Digital, this is called TLER - Time-Limited Error Recovery. Great PDF.

For Seagate, it is called ERC - Error Recovery Control.

Samsung and Hitachi call it CCTL - Command Completion Time Limit.

Some people are confused and think that these features are only necessary for hardware RAID, or aren't useful for software RAID. It is absolutely true that this is a very important feature for hardware RAID, because a hardware RAID controller is probably configured to deem a "hung" hard drive as failed and to place it in an offline or recovery status, which has many negatives associated with it. So you absolutely do want TLER/ERC/etc for a hardware RAID setup.

But what about ZFS?

If you've got a ZFS pool, and your underlying disk device appears to hang for a minute, you probably stop serving up data. This is likely to be bad behaviour for a filer. Unlike a hardware RAID controller, ZFS will typically wait for the command to complete, and if it is trying to read many sectors, this could take a very long time. So TLER/ERC/etc are also desirable properties for a ZFS system.

We've been thrilled in recent years to see the addition of "NAS" class hard drives, which are essentially conventional consumer-grade hard drives that have firmware that defaults to supporting TLER/ERC.

You can verify that a drive has TLER/ERC turned on by probing it with smartctl.

Code:
# smartctl -l scterc /dev/ada0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p8 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled


That drive doesn't have it turned on.

Code:
# smartctl -l scterc /dev/ada4
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p8 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)


That does, and it's set to a typical 7 seconds. Further, the same command can be used to try to set ERC.

Code:
# smartctl -l scterc,80,80 /dev/ada4
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p8 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control set to:
           Read:     80 (8.0 seconds)
          Write:     80 (8.0 seconds)


Some hard drives may not come with TLER/ERC enabled by default but can have it turned on regardless. If you try this, power cycle the drive afterwards to confirm that the setting sticks around. It's hard to test that TLER/ERC is working correctly without a drive that is actually encountering errors, however.
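
When checking several drives, or re-checking after a power cycle, the smartctl output shown above can be reduced to a one-word status. The following is only a rough sketch: the function name classify_scterc is made up for illustration, and it assumes smartctl prints the Read:/Write: lines in the format shown above and omits them entirely when the drive does not support the command.

```shell
#!/bin/sh
# Hypothetical helper: summarize `smartctl -l scterc` output read from stdin.
# Prints "enabled <read> <write>" (raw values, in tenths of a second),
# "disabled" when both timers are off, or "unsupported" when no
# Read:/Write: lines are present at all.
classify_scterc() {
    awk '
        /^ *Read:/  { r = $2 }   # "70" or "Disabled"
        /^ *Write:/ { w = $2 }
        END {
            if (r == "" || w == "") { print "unsupported" }
            else if (r == "Disabled" && w == "Disabled") { print "disabled" }
            else { print "enabled", r, w }
        }
    '
}

# Usage, one drive at a time:
#   smartctl -l scterc /dev/ada0 | classify_scterc
```

Running it before and after a power cycle is one way to see whether a setting survived. It says nothing about whether the drive actually honours the timeout.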

[2015-02-10] : I note that we just picked up some Samsung ST2000LM003 2.5" 2TB drives which appear to allow TLER to be set, but the setting appears to do nothing and isn't persistent. I happened to luck out in that a drive failed SMART testing with a bad sector and was therefore easily tested.

I'll be pruning responses to this thread, but if you have useful information to share, I may update this post and credit you.
 

ian351c

FreeNAS Experienced
Joined
Oct 20, 2011
Messages
191
Thanks
27
#2
Good stuff! Checked all my HDDs and found that only two of them had this enabled (out of six). I'd love to see FreeNAS check for drives with this capability and give me a yellow warning light if it's not enabled when it could be.

Prune away!
 
Joined
Jun 25, 2013
Messages
36
Thanks
3
#3
Western Digital Green series drives don't have it.
The Seagate ST8000AS0002 has no TLER/ERC/SCT support either; in addition, smartd reports the APM status returned by this 8TB drive as not ATA compliant.

The Toshiba (Hitachi) 2/3TB ABA drives have it, but it is not enabled by default.
 

Sir.Robin

FreeNAS Experienced
Joined
Apr 14, 2012
Messages
552
Thanks
41
#4
My Barracuda ES ST3750640NS drives do not have it... !!? They are old, but nevertheless... ES my bung.
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,030
Thanks
3,885
#5
Sir.Robin said:
My Barracuda ES ST3750640NS drives do not have it... !!? They are old, but nevertheless... ES my bung.

Maybe they're just not configurable via standard ATA commands.
 

Sir.Robin

FreeNAS Experienced
Joined
Apr 14, 2012
Messages
552
Thanks
41
#6
My ES2s do not give me any feedback... but the ES (750GB) says disabled.
 

avalon60

FreeNAS Experienced
Joined
Jan 15, 2014
Messages
472
Thanks
10
#7
I have just checked my WD drives, and TLER is enabled and set to 7 seconds.
 

avalon60

FreeNAS Experienced
Joined
Jan 15, 2014
Messages
472
Thanks
10
#9
As requested

WD-WCC4J5XC8EYT 1TB Purple
WD-WCC4MP4L4UH9 2TB Red NAS
WD-WMC300412931 2TB Red NAS
WD-WMC300187339 2TB Red NAS
WD-WMC300412900 2TB Red NAS
WD-WCAZAA568514 1TB Green
WD-WCAV5J838621 1TB Green
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,030
Thanks
3,885
#10
So the Greens are settable and/or already have TLER enabled?
 

avalon60

FreeNAS Experienced
Joined
Jan 15, 2014
Messages
472
Thanks
10
#11
The 2 Greens are unsupported, but the Purple is supported.
I should have said that before.
 
Joined
Jan 12, 2014
Messages
15
Thanks
0
#12
Came across this thread and tried it myself.
Seagate 3TB ST3000DM001 - SCT Error Recovery Control command not supported
Hitachi 400GB HDT725040VL - SCT Error Recovery Control: Read: Disabled, Write: Disabled
 

RegularJoe

FreeNAS Experienced
Joined
Aug 19, 2013
Messages
204
Thanks
4
#14
Hi All,

If this is the right command, I can set it for some Seagate drives, but upon reboot they revert to their previous values:

smartctl -l scterc,80,80 /dev/da0
---------------------------------------------------------

Old disks in a lab server:
Can set these, but upon reboot they go back to the default of off:
Seagate Barracuda 7200.10 ST3250310AS Firmware Version: 4.CCB
Seagate Barracuda 7200.12 ST3250318AS Firmware Version: CC66

Can set these, but after reboot they go back to the factory setting of 25 seconds:
SEAGATE(SUN) ST32500NSSUN250G Firmware Version: 3AZQ

Cannot set these:
Seagate Desktop SSHD ST4000DX001-1CE168 Firmware Version: CC44
Seagate Barracuda 7200.12 ST3250318AS Firmware Version: CC66
 
Joined
Aug 14, 2015
Messages
5
Thanks
0
#15
For the drives where it is settable but not persistent over a reboot, is it possible to put the command in some sort of startup script?
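
One possible shape for such a script, as a minimal sketch rather than a tested recipe: it issues the same scterc command used earlier in the thread for each listed drive, then reads the values back instead of assuming they took. The names DRIVES, TIMEOUT, SMARTCTL and apply_erc are invented for illustration, and the device list is a placeholder.

```shell
#!/bin/sh
# Sketch only: DRIVES, TIMEOUT, SMARTCTL and apply_erc are illustrative names.
DRIVES="${DRIVES:-/dev/ada0 /dev/ada1}"   # placeholder device list, edit to suit
TIMEOUT="${TIMEOUT:-70}"                  # tenths of a second, so 70 = 7.0 seconds
SMARTCTL="${SMARTCTL:-smartctl}"

apply_erc() {
    for dev in $DRIVES; do
        # Try to set the read and write timeouts; the output is discarded here.
        "$SMARTCTL" -l "scterc,${TIMEOUT},${TIMEOUT}" "$dev" >/dev/null 2>&1
        # Read the values back instead of trusting that the set succeeded.
        state=$("$SMARTCTL" -l scterc "$dev" 2>/dev/null | awk '
            /^ *Read:/  { r = $2 }
            /^ *Write:/ { w = $2 }
            END { if (r != "" && w != "") printf "read=%s write=%s", r, w }')
        echo "$dev: ${state:-no ERC values reported}"
    done
}
```

Calling apply_erc from whatever boot-time hook the system provides would re-apply the value on every start. As noted in the first post, a drive can accept the setting and still ignore it, so the read-back only shows what the drive reports.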
 

rogerh

FreeNAS Guru
Joined
Apr 18, 2014
Messages
1,069
Thanks
118
#16
HP 250GB SATA disk, Model VB0250EAVER, Firmware HPG7

Disabled by default, but settable to 7 seconds, and the setting persists past a reboot. But I haven't tried a cold reboot with mains power removed, and don't know if this would make a difference.

This is the stock drive with the N54L Microserver. It is also notable for a "Max recommended Temperature" of 69°C and a "Max Temperature Limit" of 60°C.
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,030
Thanks
3,885
#18
"Don't exceed 69 degrees Celsius, but we'll void your warranty if you exceed 60".

Sounds legit.
 

rogerh

FreeNAS Guru
Joined
Apr 18, 2014
Messages
1,069
Thanks
118
#19
Ericloewe said:
"Don't exceed 69 degrees Celsius, but we'll void your warranty if you exceed 60". Sounds legit.

That's more-or-less what I took it to mean - alternatively just random figures invented on separate days by an engineer with a poor memory.
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
11,784
Thanks
3,041
#20
The cynic in me can't help but read it as "we recommend running it at temperatures above the max temperature limit."
 