Disk error, what to do?

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
I have this in the notifications:
Device: /dev/da3 [SAT], Self-Test Log error count increased from 1 to 2
Fri, 01 Nov 2019 02:43:27 GMT

What am I supposed to do with it? I can't even check the SMART values anywhere. What is the log error count anyway?
I have no idea what the system is telling me!
 

Alecmascot

Guru
Joined
Mar 18, 2014
Messages
1,177
Via Google :
 

Octopuss

On the topic of the shell output, I can't do this with PuTTY, because all attempts at using smartctl end with a permission denied error. If I do it from within FreeNAS, the formatting is broken.
I can't SSH in as root, and my user with sudo permissions can't do this either, so I am pretty lost at this point.
 

Fredda

Guru
Joined
Jul 9, 2019
Messages
608
Device: /dev/da3 [SAT], Self-Test Log error count increased from 1 to 2
What am I supposed to do with it?
This probably means a self-test (long or short) failed to complete. This is usually a strong indication of a failing disk.
If I do it from within FreeNAS, the formatting is broken.
Redirect the output of smartctl into a file which you can access via SMB.
 

Octopuss

How? I am not a Linux user, so I don't know any commands.
I'd much rather be able to get putty to work.
 

Fredda

How? I am not a Linux user, so I don't know any commands.
smartctl -a /dev/your_hd_id > /path/to/smb_share/smart_output.txt
I strongly recommend that anyone using FreeNAS learn at least some basic shell commands.
There are several tutorials for beginners online.
I'd much rather be able to get putty to work.
You'll need to allow root login over SSH in the SSH service config.
BTW, I personally prefer MobaXterm to PuTTY.
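
A minimal sketch of the redirect above, assuming a root shell on the FreeNAS box. The function name save_smart_report, the device /dev/da3 (taken from the notification in the first post) and the share path /mnt/tank/share are illustrative assumptions, so substitute your own.

```shell
#!/bin/sh
# Hypothetical helper: write the full smartctl report for one disk to a
# text file inside a directory that is shared via SMB.
# Usage: save_smart_report DEVICE SHARE_DIR
save_smart_report() {
    dev=$1
    dir=$2
    # e.g. /dev/da3 -> da3, used to build the file name
    name=$(basename "$dev")
    out="$dir/smart_$name.txt"
    # -a prints all SMART information; 2>&1 also captures error messages
    # such as "Permission denied" so they end up in the file too.
    # smartctl's exit status is a bit mask that can be non-zero for a disk
    # with logged errors, so it is not treated as a failure here.
    smartctl -a "$dev" > "$out" 2>&1 || true
    # Print the path of the file that was written.
    echo "$out"
}

# Example (paths are assumptions, run as root):
# save_smart_report /dev/da3 /mnt/tank/share
```

The resulting text file can then be opened from a Windows machine through the share, which avoids the broken formatting of the web shell.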
 

Octopuss

How would I give permission to non-root users? Though I am not sure what else might be needed since the sudo checkbox is checked for the user I am using already.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977

Octopuss

Type su once you are logged in with PuTTY and enter the root password.
Logged in with what?
It doesn't work either way.
I enabled root login in the SSH settings, then logged in as root and typed su.
Then I logged back in as the other user, and the smartctl command still says permission denied.
 

Octopuss

Anyway, with root, I got smartctl output in readable format:
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E3JL005S
LU WWN Device Id: 5 0014ee 20ca227ed
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Nov  2 15:08:51 2019 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 118) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (53940) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 539) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    40
  3 Spin_Up_Time            POS--K   209   178   021    -    6533
  4 Start_Stop_Count        -O--CK   097   097   000    -    3640
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   070   070   000    -    22059
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   099   099   000    -    1722
192 Power-Off_Retract_Count -O--CK   200   200   000    -    79
193 Load_Cycle_Count        -O--CK   199   199   000    -    3649
194 Temperature_Celsius     -O---K   113   106   000    -    39
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    3
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb6  GPL,SL  VS       1  Device vendor specific log
0xb7       GPL,SL  VS      39  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 2
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 20944 hours (872 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  04 -- 51 00 00 00 00 c2 92 cd 50 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  ea 00 00 00 00 00 00 00 00 00 00 40 00  3d+20:13:07.096  FLUSH CACHE EXT
  61 00 08 00 00 00 00 7e bf 75 28 40 00  3d+20:13:07.096  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 cc 49 7a 48 40 00  3d+20:13:07.096  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 cc 49 7a 28 40 00  3d+20:13:07.096  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 cc 49 7a 10 40 00  3d+20:13:07.096  WRITE FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 17742 hours (739 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  04 -- 51 00 00 00 01 ac 0c 10 c0 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  ea 00 00 00 00 00 00 00 00 00 00 40 00 12d+20:16:24.519  FLUSH CACHE EXT
  61 00 08 00 00 00 00 34 73 c2 78 40 00 12d+20:16:24.519  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 ac 0c 10 b8 40 00 12d+20:16:24.519  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 86 22 07 e8 40 00 12d+20:16:24.519  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 24 1a 28 40 40 00 12d+20:16:24.519  WRITE FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     22058         3264400720
# 2  Extended offline    Completed: read failure       60%     22023         3264400720
# 3  Short offline       Completed without error       00%     21903         -
# 4  Short offline       Completed without error       00%     21731         -
# 5  Short offline       Completed without error       00%     21564         -
# 6  Short offline       Completed without error       00%     21396         -
# 7  Extended offline    Completed: read failure       60%     21280         3264400720
# 8  Short offline       Completed without error       00%     21228         -
# 9  Short offline       Completed without error       00%     21061         -
#10  Short offline       Completed without error       00%     20898         -
#11  Short offline       Completed without error       00%     20725         -
#12  Short offline       Completed without error       00%     20557         -
#13  Short offline       Completed without error       00%     20389         -
#14  Short offline       Completed without error       00%     20222         -
#15  Short offline       Completed without error       00%     20054         -
#16  Short offline       Completed without error       00%     19893         -
#17  Extended offline    Completed without error       00%     19824         -
#18  Short offline       Completed without error       00%     19718         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    39 Celsius
Power Cycle Min/Max Temperature:     31/42 Celsius
Lifetime    Min/Max Temperature:      3/46 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (103)

Index    Estimated Time   Temperature Celsius
 104    2019-11-02 07:11    41  **********************
 ...    ..( 32 skipped).    ..  **********************
 137    2019-11-02 07:44    41  **********************
 138    2019-11-02 07:45    42  ***********************
 ...    ..( 91 skipped).    ..  ***********************
 230    2019-11-02 09:17    42  ***********************
 231    2019-11-02 09:18    41  **********************
 ...    ..( 37 skipped).    ..  **********************
 269    2019-11-02 09:56    41  **********************
 270    2019-11-02 09:57    40  *********************
 ...    ..( 11 skipped).    ..  *********************
 282    2019-11-02 10:09    40  *********************
 283    2019-11-02 10:10    39  ********************
 ...    ..( 13 skipped).    ..  ********************
 297    2019-11-02 10:24    39  ********************
 298    2019-11-02 10:25    38  *******************
 ...    ..(244 skipped).    ..  *******************
  65    2019-11-02 14:30    38  *******************
  66    2019-11-02 14:31    39  ********************
 ...    ..(  6 skipped).    ..  ********************
  73    2019-11-02 14:38    39  ********************
  74    2019-11-02 14:39    40  *********************
 ...    ..( 10 skipped).    ..  *********************
  85    2019-11-02 14:50    40  *********************
  86    2019-11-02 14:51    41  **********************
 ...    ..( 16 skipped).    ..  **********************
 103    2019-11-02 15:08    41  **********************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            3  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            4  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4      1399266  Vendor specific
 

Fredda

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 60% 22058 3264400720
# 2 Extended offline Completed: read failure 60% 22023 3264400720
The SMART tests are failing due to read errors. You should replace that drive.
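
To pull just the failing entries out of a saved report, something like the following might help. It is only a sketch: it assumes the smartctl output was saved to a text file, that failed entries contain the word "failure" in the status column as in the log above, and that the disk uses 512-byte logical sectors as its information section reports. The function names and the file name smart_da3.txt are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print only self-test log entries whose status
# mentions a failure, e.g. "Completed: read failure".
# Usage: failed_selftests REPORT_FILE
failed_selftests() {
    # Self-test log entries start with "#" followed by the entry number.
    grep '^# *[0-9]' "$1" | grep 'failure'
}

# Hypothetical helper: convert an LBA from the log into a byte offset,
# assuming 512-byte logical sectors.
# Usage: lba_to_bytes LBA
lba_to_bytes() {
    echo $(( $1 * 512 ))
}

# Example (file name is an assumption):
# failed_selftests smart_da3.txt
# lba_to_bytes 3264400720
```

With the LBA from the log, lba_to_bytes 3264400720 gives 1671373168640, which is roughly 42% of the way into a 4,000,787,030,016-byte disk. That is consistent with the extended test stopping with 60% remaining each time.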
 

Octopuss

I'll see if I can RMA it.

And the other question?
 

Octopuss

 

Fredda

These are actually two questions. The first question is answered in post #11; the second is answered in post #4.
 

Octopuss

Sigh.
I am trying to get a non-root account to be able to use the smartctl command.
 

Jailer


Fredda

Sigh.
I am trying to get a non-root account to be able to use the smartctl command.
Sigh, if you want to know something, you should ask that question and not leave people here to guess what you might want to know.
smartctl is a command that works at a low level on the hardware; commands of that kind always need root privileges.
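
For a non-root account that has sudo rights, one possible approach is to run the command through sudo instead of calling it directly. The wrapper name smartctl_as_root below is hypothetical, and the sketch assumes sudo is installed and the account is allowed to use it.

```shell
#!/bin/sh
# Hypothetical wrapper: run smartctl directly when already root,
# otherwise go through sudo (which may prompt for the user's password).
# Usage: smartctl_as_root -a /dev/da3
smartctl_as_root() {
    if [ "$(id -u)" -eq 0 ]; then
        # Effective user ID 0 means the shell is already running as root.
        smartctl "$@"
    else
        sudo smartctl "$@"
    fi
}

# Example (device name taken from the notification in the first post):
# smartctl_as_root -a /dev/da3
```

Calling smartctl without sudo from the non-root account is what would be expected to produce the permission denied error described earlier in the thread.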
 

Octopuss


Jailer

Did you even read my post #9, which I even linked here again in post #14?
Of course, did you read mine? Twice?

Good luck getting things figured out.
 