SmartD saying offline, but seems fine

OldHoster

Cadet
Joined
Jan 23, 2017
Messages
9
I get these logs from smartd on a couple of drives. I used to have the same error with ada5. I swapped drives and cables and resilvered, and the error went away for a week or two, then came back. Now it is happening on two drives.

Apr 2 09:13:19 StorageBox smartd[2780]: Device: /dev/ada5, 199 Currently unreadable (pending) sectors
Apr 2 09:13:19 StorageBox smartd[2780]: Device: /dev/ada5, 236 Offline uncorrectable sectors
Apr 2 09:13:20 StorageBox smartd[2780]: Device: /dev/ada4, 1 Currently unreadable (pending) sectors
Apr 2 09:13:20 StorageBox smartd[2780]: Device: /dev/ada4, 1 Offline uncorrectable sectors

This is the `zpool status -v` output for the Storage vdev that has those two drives in it. The scrub is about two-thirds of the way through. They are 10tb drives at almost 80%, so it is taking a while.

raidz2-0 ONLINE 0 0 0
gptid/fa828580-8ddc-11e6-b33b-305a3a0e09dc ONLINE 0 0 0
gptid/219ca49c-c223-11e8-808d-305a3a0e09dc ONLINE 0 0 0
gptid/235413e7-8d1f-11e6-b3bf-305a3a0e09dc ONLINE 0 0 0
gptid/356ce2d0-8c57-11e6-9632-305a3a0e09dc ONLINE 0 0 0
gptid/b11fd36d-8ac2-11e6-8261-305a3a0e09dc ONLINE 0 0 0
gptid/8ee15510-8a5a-11e6-806d-305a3a0e09dc ONLINE 0 0 0

Should I swap drives, cables, etc.?

-OH
 
Joined
Oct 18, 2018
Messages
969
Would you mind posting more of your system specs? Hardware, FreeNAS version, etc?

I suspect that what you're seeing is perfectly normal for drives that have some bad sectors, and is not a cable issue. What is the full output of `smartctl -a` on those devices? It might be time for some drive replacement, especially for the drive with more than 200 bad sectors.
 

OldHoster

Cadet
Joined
Jan 23, 2017
Messages
9
Here are the full outputs of each. This is FreeNAS 11.2, and these are 6TB WD Blue drives. I will replace them if I need to; I just haven't wanted to take the time over a couple of weekends.
 

Attachments

  • ada4.txt
    6.2 KB · Views: 308
  • ada5.txt
    6.1 KB · Views: 358
Joined
Oct 18, 2018
Messages
969
Would you mind using code tags or attaching that as a file? *currently reading*
 
Joined
Oct 18, 2018
Messages
969
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 199
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 236
These values indicate that the drive has sectors it could not read. With this many, it may be a sign the drive is starting to fail, and you may want to consider replacing it.
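As a rough illustration of how those two lines can be read, here is a minimal Python sketch that pulls the raw value (the last column) out of attribute lines in the format quoted above. The function name `parse_smart_attributes` is made up for this example and is not part of smartmontools, and the sketch assumes the ten-column attribute layout seen in this thread.

```python
def parse_smart_attributes(text):
    """Map attribute ID -> (name, raw value) for smartctl-style attribute lines.

    Assumes the ten-column layout quoted in this thread:
    ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not an attribute row (headers, blank lines).
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]
        # Skip rows whose raw value is not a plain integer.
        if raw.isdigit():
            attrs[int(fields[0])] = (fields[1], int(raw))
    return attrs


sample = """\
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 199
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 236
"""
attrs = parse_smart_attributes(sample)
print(attrs[197], attrs[198])
```

The raw value, not the normalized 200/200 columns, is what carries the sector count here.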

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 1
This is not TERRIBLE. I've seen some folks on the forums suggest you can keep an eye on drives with so few bad sectors. If the number doesn't grow, you may be alright. If it does grow, you may want to replace the drive. I had a similarly small number of failures recently and just RMAed the drive right away. Even if my drive had not been under warranty, I probably would've replaced it anyway rather than risk it.
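One way to "keep an eye on" the count is to record the raw values every so often and compare them. Here is a minimal Python sketch, assuming you have already collected the raw values for attributes 197 and 198 at two points in time. The function name `grown_attributes` and the second snapshot are hypothetical, invented only for illustration.

```python
WATCHED = {197: "Current_Pending_Sector", 198: "Offline_Uncorrectable"}


def grown_attributes(earlier, later):
    """Return {attribute ID: (old, new)} for watched attributes whose raw value increased."""
    grown = {}
    for attr_id in WATCHED:
        old = earlier.get(attr_id, 0)
        new = later.get(attr_id, 0)
        if new > old:
            grown[attr_id] = (old, new)
    return grown


# The first snapshot matches the values quoted above for this drive;
# the second snapshot is invented purely for illustration.
first = {197: 1, 198: 1}
second = {197: 4, 198: 1}

for attr_id, (old, new) in grown_attributes(first, second).items():
    print(f"{WATCHED[attr_id]} grew from {old} to {new}")
```

If a check like this ever reports growth, that would be my cue to replace the drive rather than keep watching it.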

194 Temperature_Celsius 0x0022 106 093 000 Old_age Always - 46
Looks like you've got some high temps in there. Do your drives always sit above 40 °C?

199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 1
This indicates a possible communication error. You may try replacing the cable or swapping the ports you're using, then rerunning the tests.

I think this HDD troubleshooting resource may be helpful.

Any chance these drives are still under warranty?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The "offline" in "offline uncorrectable" doesn't mean anything useful. The sectors are bad and that's all that means.
 