Old Critical Sector warning messages won't clear

Gcon · Aug 2, 2015

Hi all,

the web GUI keeps telling me this:

CRITICAL: Device: /dev/ada2, 8 Currently unreadable (pending) sectors
CRITICAL: Device: /dev/ada2, 8 Offline uncorrectable sectors

I replaced the hard drive ada2 with a new one (as per my blog here: http://gavowen.ninja/?p=187) yet the error message won't clear in the GUI. How can I get rid of it? All 4 of my disks are clear of these errors, according to the SMART readout:

Code:

root@sarlacc# smartctl -A /dev/ada0 | egrep -e "^197|^198"
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
root@sarlacc# smartctl -A /dev/ada1 | egrep -e "^197|^198"
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
root@sarlacc# smartctl -A /dev/ada2 | egrep -e "^197|^198"
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
root@sarlacc# smartctl -A /dev/ada3 | egrep -e "^197|^198"
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0

I've done a reboot, and a volume scrub, followed by another reboot, and the warnings still won't go away. Any ideas? Thanks in advance.
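
The four per-disk commands above can be folded into one loop. What follows is only a sketch in POSIX sh, not anything FreeNAS ships: the function names `sector_counts` and `check_disks` are made up for illustration, and it assumes the raw value of attributes 197 and 198 is the last column of `smartctl -A` output, as it is in the readout above.

```shell
#!/bin/sh
# Sketch: print SMART attributes 197 and 198 for several disks in one pass.
# Function names are hypothetical; nothing here is part of FreeNAS.

# Reduce `smartctl -A` output on stdin to "<id> <name> <raw value>"
# for attributes 197 (pending) and 198 (offline uncorrectable).
# Assumes the raw value is the last whitespace-separated field.
sector_counts() {
    awk '$1 == 197 || $1 == 198 { print $1, $2, $NF }'
}

# Run the filter against each disk name given as an argument.
check_disks() {
    for disk in "$@"; do
        echo "== /dev/$disk"
        smartctl -A "/dev/$disk" | sector_counts
    done
}

# Usage (as root): check_disks ada0 ada1 ada2 ada3
```

Keeping the filter separate from the loop means it can also be fed saved `smartctl` output from a file.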

INCSlayer · Aug 2, 2015

I believe the command you are looking for is zpool clear [name of pool]

Gcon · Aug 2, 2015

INCSlayer said:
I believe the command you are looking for is zpool clear [name of pool]

I tried that but still the same. These are S.M.A.R.T. errors and not ZFS errors. ZFS thinks everything is happy:

Code:

root@sarlacc# zpool status volume1
  pool: volume1
state: ONLINE
  scan: scrub repaired 0 in 5h5m with 0 errors on Fri Jul 31 15:39:31 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        volume1                                         ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/b0b75076-ecbd-11e4-a367-a0369f25d4a0  ONLINE       0     0     0
            gptid/b10e16d2-ecbd-11e4-a367-a0369f25d4a0  ONLINE       0     0     0
            gptid/c9e094bf-36c2-11e5-847c-a0369f25d4a0  ONLINE       0     0     0
            gptid/b1c8f76e-ecbd-11e4-a367-a0369f25d4a0  ONLINE       0     0     0

errors: No known data errors

I need FreeNAS to clear its Alert System somehow, and refresh its S.M.A.R.T. error log. I turned the S.M.A.R.T. service off then back on - no difference. Rebooted with it off - no difference. Removed S.M.A.R.T. from /dev/ada2 - no difference. It's really frustrating to see these errors not clear!

dlavigne · Aug 2, 2015

IIRC, these will automatically clear themselves after a period of time. There was a similar forum post about this a few weeks ago but I unfortunately don't remember the URL to that post.

DifferentStrokes · Aug 2, 2015

I believe that they will also disappear if you reboot.

Robert Trevellyan · Aug 2, 2015

Gcon said:
Any ideas?

Uncheck the box next to that warning in the popup. Eventually it will age out of the logs too.

Gcon · Aug 2, 2015

Things I've tried

Rebooted
Made sure there is a S.M.A.R.T. short-test task that runs for all my disks. This has run with no errors on any of my 4 disks - confirmed by checking the S.M.A.R.T. logs, then rebooted
Unticked messages in GUI and rebooted
Toggled S.M.A.R.T. service off then on, then rebooted
Turned S.M.A.R.T. off then back on for the affected drive, then rebooted

Have given up now. I hope it times out. My concern is that I get fresh errors and ignore them because I think it's a stale error. I really think this message should disappear the next moment the S.M.A.R.T. task runs on the drive and gleans there are no issues. Oh well... will just wait and see what happens after the next software update.

cyberjock · Aug 18, 2015

Did the errors eventually time out and clear?

Gcon · Aug 18, 2015

cyberjock said:
Did the errors eventually time out and clear?

Unfortunately not. I figure that at some stage I'll be moving to FreeNAS 10 in another year or so and hope that the issue will clear by then, if not before. Until that time I'll just run some manual S.M.A.R.T. tools (smartctl) on the CLI from time to time. I've just unticked the messages for now.

cyberjock · Aug 18, 2015

That's odd... Can you attach a debug file? System -> Advanced -> Save Debug.

I'm curious why this is happening to you. :P

Gcon · Aug 19, 2015

cyberjock said:
That's odd... Can you attach a debug file? System -> Advanced -> Save Debug.

I'm curious why this is happening to you. :p

As requested here's the debug of the "Sarlacc" - a ravenous NAS that just loves to be fed! :p

rogerh · Aug 19, 2015

My casual observation is that unticked errors remain in the alert box "forever", until they are displaced by new errors, in sufficient numbers for them to scroll off the bottom of the box. However, I think new errors of the same description do result in new alerts. I think.

Gcon · Aug 19, 2015

Success!

I happened to be digging through the /tmp directory and found this file called ".smartalert".

Code:

-rw-r--r--   1 root  wheel  uarch  153 Jul 30 23:27 .smartalert

root@sarlacc# cat .smartalert
(dp1
S'/dev/ada2'
p2
(lp3
S'Device: /dev/ada2, 8 Currently unreadable (pending) sectors'
p4
aS'Device: /dev/ada2, 8 Offline uncorrectable sectors'
p5

root@sarlacc# date
Wed Aug 19 21:47:06 AEST 2015
root@sarlacc#

Seems like it was hanging around like a bad smell. I removed the file and rebooted - no more issues with stale alerts! :)
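
For anyone wanting to look before deleting, here is a rough sketch in POSIX sh. It makes a few assumptions: the file is the line-oriented text dump shown above (it looks like a Python pickle in its text protocol, but I have not confirmed that), the function names `alert_strings` and `retire_alert_file` are invented for this example, and it moves the file aside with a timestamp instead of removing it, which is a more cautious variation on what I actually did.

```shell
#!/bin/sh
# Sketch: inspect and retire a stale alert cache file.
# Function names are hypothetical; the default path is the one from this thread.

# Print every quoted string stored in the file (device names and messages).
# Assumes each string sits on its own line as S'...' or aS'...'.
alert_strings() {
    sed -n "s/^a\{0,1\}S'\(.*\)'\$/\1/p" "${1:-/tmp/.smartalert}"
}

# Move the file aside rather than deleting it, and print the backup path.
retire_alert_file() {
    file=${1:-/tmp/.smartalert}
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    backup="$file.stale.$(date +%Y%m%d%H%M%S)"
    mv "$file" "$backup" && echo "$backup"
}

# Usage (as root):
#   alert_strings        # see what is cached
#   retire_alert_file    # move it out of the way, then reboot
```

If the strings listed do not match the alerts stuck in the GUI, this file is probably not the culprit and is best left alone.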

viniciusferrao · Jan 8, 2016

I have the same issue; on my FreeNAS box it is a PITA to read the alerts:

Code:

WARNING: Firmware version 16 does not match driver version 20 for /dev/mps0
 CRITICAL: Device: /dev/da11 [SAT], 8 Currently unreadable (pending) sectors
 CRITICAL: Device: /dev/da11 [SAT], 8 Offline uncorrectable sectors
 CRITICAL: Device: /dev/da11 [SAT], Self-Test Log error count increased from 0 to 1
 CRITICAL: Device: /dev/da11 [SAT], ATA error count increased from 0 to 3
 CRITICAL: Device: /dev/da3 [SAT], not capable of SMART self-check
 CRITICAL: Device: /dev/da3 [SAT], failed to read SMART Attribute Data
 CRITICAL: Device: /dev/da3 [SAT], Read SMART Self-Test Log Failed
 CRITICAL: Device: /dev/da3 [SAT], Read SMART Error Log Failed
 CRITICAL: Device: /dev/da14 [SAT], 16 Currently unreadable (pending) sectors
 CRITICAL: Device: /dev/da14 [SAT], 16 Offline uncorrectable sectors
 CRITICAL: Device: /dev/da14 [SAT], 8 Currently unreadable (pending) sectors
 CRITICAL: Device: /dev/da14 [SAT], 8 Offline uncorrectable sectors
 CRITICAL: Device: /dev/da14 [SAT], Self-Test Log error count increased from 0 to 1
 CRITICAL: Device: /dev/da14 [SAT], ATA error count increased from 0 to 1
 CRITICAL: Device: /dev/da14 [SAT], 352 Currently unreadable (pending) sectors
 CRITICAL: Device: /dev/da14 [SAT], 352 Offline uncorrectable sectors
 CRITICAL: Device: /dev/da14 [SAT], ATA error count increased from 4 to 6
 CRITICAL: Device: /dev/da14 [SAT], unable to open device
 CRITICAL: Device: /dev/da5 [SAT], Self-Test Log error count increased from 0 to 1
 CRITICAL: Device: /dev/da5 [SAT], not capable of SMART self-check
 CRITICAL: Device: /dev/da5 [SAT], failed to read SMART Attribute Data
 CRITICAL: Device: /dev/da5 [SAT], Read SMART Self-Test Log Failed
 CRITICAL: Device: /dev/da5 [SAT], Read SMART Error Log Failed
 OK: There is a new update available! Apply it in System -> Update tab.

I will perform a removal of the /tmp/.smartalert as noted by Gcon to see if this works here too.
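
With a list this long it helps to see how many alerts each device accounts for before deciding what is stale. A minimal sketch in POSIX sh, assuming the alert text has been pasted into a file in the "CRITICAL: Device: /dev/..." form shown above; the function name `alerts_per_device` and the file name `alerts.txt` are just placeholders.

```shell
#!/bin/sh
# Sketch: count CRITICAL alert lines per device from alert text on stdin.
# The function name is hypothetical; the input format is the one pasted above.

alerts_per_device() {
    awk '$1 == "CRITICAL:" && $2 == "Device:" {
             d = $3
             sub(/,$/, "", d)      # "/dev/ada2," -> "/dev/ada2"
             n[d]++
         }
         END { for (d in n) print n[d], d }' |
    sort -k1,1nr -k2,2             # busiest device first, then by name
}

# Usage: alerts_per_device < alerts.txt
```

Lines that are not per-device CRITICAL alerts, such as the firmware warning and the update notice, are ignored.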

Guille · Jan 11, 2018

Gcon said:
Success!

I happened to be digging through the /tmp directory and found this file called ".smartalert".

Code:
-rw-r--r--   1 root  wheel  uarch  153 Jul 30 23:27 .smartalert

root@sarlacc# cat .smartalert
(dp1
S'/dev/ada2'
p2
(lp3
S'Device: /dev/ada2, 8 Currently unreadable (pending) sectors'
p4
aS'Device: /dev/ada2, 8 Offline uncorrectable sectors'
p5

root@sarlacc# date
Wed Aug 19 21:47:06 AEST 2015
root@sarlacc#

Seems like it was hanging around like a bad smell. I removed the file and rebooted - no more issues with stale alerts! :)

Man I love you.

A. Schmidt · Mar 24, 2018

Gcon said:
Success!

I happened to be digging through the /tmp directory and found this file called ".smartalert".

Code:
-rw-r--r--   1 root  wheel  uarch  153 Jul 30 23:27 .smartalert

root@sarlacc# cat .smartalert
(dp1
S'/dev/ada2'
p2
(lp3
S'Device: /dev/ada2, 8 Currently unreadable (pending) sectors'
p4
aS'Device: /dev/ada2, 8 Offline uncorrectable sectors'
p5

root@sarlacc# date
Wed Aug 19 21:47:06 AEST 2015
root@sarlacc#

Seems like it was hanging around like a bad smell. I removed the file and rebooted - no more issues with stale alerts! :)

You are my hero!

Pseudobolt · Oct 5, 2018

Gcon said:
Success!

I happened to be digging through the /tmp directory and found this file called ".smartalert".

...

Seems like it was hanging around like a bad smell. I removed the file and rebooted - no more issues with stale alerts! :)

So this is still a thing. Cheers for the tip!
