"Critical error - ata2 error count increased from 0 to 1"

Louis_Cyphre

Cadet
Joined
Aug 28, 2020
Messages
6
So, I only have 8TB WD Red drives in my RAID, with the exception of one 8TB IronWolf disk. Two of the WD drives are 2 years old and all the others, including the Seagate, are 1 year old. A couple of days ago I got this error: "Critical error - ata2 error count increased from 0 to 1".

Here is the question: is my drive failing? I only found some old threads on this, and one thread where someone had this error on all of his drives.

I have heard that the IronWolf is not as reliable as the WDs, but after one year? My drives are running at temperatures of around 37-41°C.

Intel i5-9600K
Asus Prime B365M-A
16GB DDR4-2666
4x WD 8TB RED
1x Seagate 8TB Ironwolf
Intel Gigabit PCI-E EXPI9301CTBLK

Code:
root@freenas[~]# smartctl -a /dev/ada2
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Family:     Seagate IronWolf
Device Model:     ST8000VN004-2M2101
Serial Number:    WSD5NE4V
LU WWN Device Id: 5 000c50 0e3701740
Firmware Version: SC60
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Dec  6 00:39:22 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  559) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 698) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x50bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.


SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   084   064   044    Pre-fail  Always       -       226168952
  3 Spin_Up_Time            0x0003   084   084   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       9
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       64
  7 Seek_Error_Rate         0x000f   082   060   045    Pre-fail  Always       -       158797125
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       6784
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       9
 18 Head_Health             0x000b   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   055   040    Old_age   Always       -       40 (Min/Max 29/45)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       291
194 Temperature_Celsius     0x0022   040   045   000    Old_age   Always       -       40 (0 26 0 0 0)
195 Hardware_ECC_Recovered  0x001a   084   064   000    Old_age   Always       -       226168952
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       6772 (146 91 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       13728089576
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       36338870808


SMART Error Log Version: 1
ATA Error Count: 2
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.


Error 2 occurred at disk power-on lifetime: 6615 hours (275 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.


  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455


  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 58 ff ff ff 4f 00  28d+04:44:09.104  READ FPDMA QUEUED
  60 00 58 ff ff ff 4f 00  28d+04:44:09.104  READ FPDMA QUEUED
  61 00 08 ff ff ff 4f 00  28d+04:44:03.427  WRITE FPDMA QUEUED
  60 00 50 ff ff ff 4f 00  28d+04:44:02.288  READ FPDMA QUEUED
  60 00 58 ff ff ff 4f 00  28d+04:44:01.780  READ FPDMA QUEUED


Error 1 occurred at disk power-on lifetime: 6364 hours (265 days + 4 hours)
  When the command that caused the error occurred, the device was active or idle.


  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455


  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 58 ff ff ff 4f 00  17d+18:03:46.415  READ FPDMA QUEUED
  61 00 08 ff ff ff 4f 00  17d+18:03:46.415  WRITE FPDMA QUEUED
  60 00 50 ff ff ff 4f 00  17d+18:03:46.414  READ FPDMA QUEUED
  60 00 b0 ff ff ff 4f 00  17d+18:03:35.781  READ FPDMA QUEUED
  60 00 50 ff ff ff 4f 00  17d+18:03:34.017  READ FPDMA QUEUED


SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6770         -


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
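
For anyone who wants to screen output like the above automatically, here is a minimal Python sketch that pulls the rows out of the attribute table and lists a few counters whose raw value is non-zero. The watch list (IDs 5, 187, 197, 198, 199) and the names `WATCHED_IDS`, `ROW` and `flag_attributes` are my own assumptions for illustration, not something smartctl or Seagate defines. A non-zero value is a prompt to look closer, not a verdict that the drive is failing.

```python
import re

# Attribute IDs often treated as early-warning signs when their raw value is
# non-zero. This watch list is an assumption for illustration, not a rule
# taken from smartctl or from the drive vendor.
WATCHED_IDS = {5, 187, 197, 198, 199}

# Matches one row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def flag_attributes(smartctl_text):
    """Return (id, name, raw) for watched attributes with a non-zero raw value."""
    flagged = []
    for line in smartctl_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not an attribute row
        attr_id, name, raw = int(m.group(1)), m.group(2), int(m.group(9))
        if attr_id in WATCHED_IDS and raw > 0:
            flagged.append((attr_id, name, raw))
    return flagged


# Rows copied from the smartctl output in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       64
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

for attr_id, name, raw in flag_attributes(SAMPLE):
    print(f"{attr_id:>3} {name}: raw value {raw}")
```

On the rows above this lists Reallocated_Sector_Ct (64) and Reported_Uncorrect (2), which lines up with the two UNC entries in the error log.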
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
I would certainly be on the watch and be extra paranoid about my backups.
 

Louis_Cyphre

Cadet
Joined
Aug 28, 2020
Messages
6
ChrisRJ said:
I would certainly be on the watch and be extra paranoid about my backups.
Well, true. I have RAIDZ2, so I'm not exactly in dire straits, since 2 drives can fail without the RAID failing. On the other hand, I am setting up a hot spare...
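
To put rough numbers on that, here is a small Python sketch of the RAIDZ2 arithmetic. It assumes all five 8TB disks from my hardware list sit in a single RAIDZ2 vdev, and it ignores ZFS metadata and padding overhead, so the capacity figure is only approximate. The function name `raidz_summary` and its parameters are made up for this example.

```python
def raidz_summary(disks, parity, disk_tb, failed=0):
    """Rough RAIDZ numbers for one vdev (ignores ZFS overhead).

    disks   - number of disks in the vdev
    parity  - RAIDZ level: 1, 2 or 3
    disk_tb - size of each disk in TB
    failed  - disks currently failed or removed
    """
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ parity level must be 1, 2 or 3")
    if disks <= parity:
        raise ValueError("need more disks than the parity level")
    return {
        # How many disks the vdev can lose in total before data is lost.
        "tolerated_failures": parity,
        # How many more disks can fail right now; negative means data loss.
        "remaining_redundancy": parity - failed,
        # Approximate space left for data once parity is subtracted.
        "data_capacity_tb": (disks - parity) * disk_tb,
    }


# Assumed layout: 4x WD Red + 1x IronWolf in one RAIDZ2 vdev.
print(raidz_summary(disks=5, parity=2, disk_tb=8))
# If the IronWolf were to drop out:
print(raidz_summary(disks=5, parity=2, disk_tb=8, failed=1))
```

With one disk out the pool would still survive one more failure, which is the window a hot spare is meant to shorten.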
 