Kernel panic due to vdev disk hangs, inconsistent disks, need help troubleshooting

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
Hey All,

I haven't found a ton of information related to my problem on the forums, so I'm posting to see if anyone has some ideas to help troubleshoot.

About a week ago I had a mysterious reboot, with this message:
I/O to pool 'tank' appears to be hung on vdev guid 12989182611123154543 at '/dev/gptid/51b4e5cf-99c7-11e9-a5e3-3860778b006f'.

After researching, I assumed this meant that one particular drive was about to give up the ghost, so I was preparing to swap in a new drive when the box rebooted again (10/7). This time a different GPTID was cited as the problem. I've now had five reboots, with all four disks showing up at least once:
Oct 5 19:03 I/O to pool 'tank' appears to be hung on vdev guid 12989182611123154543 at '/dev/gptid/51b4e5cf-99c7-11e9-a5e3-3860778b006f'.
Oct 7 04:45 I/O to pool 'tank' appears to be hung on vdev guid 15794669731177043768 at '/dev/gptid/62a87797-74e4-11e8-80b0-3860778b006f'.
Oct 10 19:50 I/O to pool 'tank' appears to be hung on vdev guid 8409777197129091168 at '/dev/gptid/d15a6389-8ba8-11e9-b5ab-3860778b006f'.
Oct 13 17:45 I/O to pool 'tank' appears to be hung on vdev guid 15794669731177043768 at '/dev/gptid/62a87797-74e4-11e8-80b0-3860778b006f'
Oct 14 19:16 I/O to pool 'tank' appears to be hung on vdev guid 2622870734464222642 at '/dev/gptid/6e2c1e3d-9251-11e9-b4ef-3860778b006f'.
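
As a quick sanity check on the list above, here is a minimal Python sketch that tallies how often each gptid is named and how many hours pass between panics. The log lines are copied from the post; the year 2020 is an assumption taken from the "Local Time" line in the smartctl output, since the panic messages themselves carry no year. The names `PANIC_LOG`, `parse_panics`, `per_disk` and `gaps_hours` are made up for this illustration.

```python
import re
from collections import Counter
from datetime import datetime

# The five panic messages exactly as quoted in the post.
PANIC_LOG = """\
Oct 5 19:03 I/O to pool 'tank' appears to be hung on vdev guid 12989182611123154543 at '/dev/gptid/51b4e5cf-99c7-11e9-a5e3-3860778b006f'.
Oct 7 04:45 I/O to pool 'tank' appears to be hung on vdev guid 15794669731177043768 at '/dev/gptid/62a87797-74e4-11e8-80b0-3860778b006f'.
Oct 10 19:50 I/O to pool 'tank' appears to be hung on vdev guid 8409777197129091168 at '/dev/gptid/d15a6389-8ba8-11e9-b5ab-3860778b006f'.
Oct 13 17:45 I/O to pool 'tank' appears to be hung on vdev guid 15794669731177043768 at '/dev/gptid/62a87797-74e4-11e8-80b0-3860778b006f'
Oct 14 19:16 I/O to pool 'tank' appears to be hung on vdev guid 2622870734464222642 at '/dev/gptid/6e2c1e3d-9251-11e9-b4ef-3860778b006f'.
"""

# Timestamp at the start of the line, gptid inside the quoted device path.
LINE_RE = re.compile(r"^(\w{3} \d{1,2} \d{2}:\d{2}) .*'/dev/gptid/([0-9a-f-]+)'")


def parse_panics(text, year=2020):
    """Return a list of (datetime, gptid) pairs; `year` is an assumption."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M")
            events.append((when, match.group(2)))
    return events


events = parse_panics(PANIC_LOG)

# How many times each disk was blamed for the hang.
per_disk = Counter(gptid for _, gptid in events)

# Hours between consecutive panics, rounded to one decimal place.
gaps_hours = [
    round((later[0] - earlier[0]).total_seconds() / 3600, 1)
    for earlier, later in zip(events, events[1:])
]

for gptid, count in per_disk.items():
    print(f"{gptid}: {count}")
print("hours between panics:", gaps_hours)
```

Every member of the pool is named at least once and only one disk is named twice, which is hard to square with a single failing drive. The gaps between panics are all longer than a day.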

I've run short and conveyance SMART tests on all of the disks in my pool (4x 4TB drives in RAIDZ2, pool details below, 16GB of RAM, disk space only about 51% used) with zero errors. However, I have not been able to complete any long SMART tests because the machine now reboots every couple of days and the tests do not complete in that time. A scrub completed on Sunday night with no issues.

smartctl -a /dev/ada0
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Serial Number:    ZFN16ANN
LU WWN Device Id: 5 000c50 0b013b768
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Oct 14 20:21:05 2020 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 487) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x30a5) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   064   006    Pre-fail  Always       -       41783703
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   095   060   045    Pre-fail  Always       -       3063684620
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20329 (185 104 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       24
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   057   049   040    Old_age   Always       -       43 (Min/Max 38/49)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       718
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       872
194 Temperature_Celsius     0x0022   043   051   000    Old_age   Always       -       43 (0 29 0 0 0)
195 Hardware_ECC_Recovered  0x001a   076   064   000    Old_age   Always       -       41783703
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   191   000    Old_age   Always       -       23
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       20239h+32m+43.634s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       68382963370
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       90268081333

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 90%     20329         -
# 2  Extended offline    Interrupted (host reset)      90%     20328         -
# 3  Extended offline    Interrupted (host reset)      80%     20302         -
# 4  Conveyance offline  Completed without error       00%     20270         -
# 5  Short offline       Completed without error       00%     20270         -
# 6  Short offline       Completed without error       00%     20270         -
# 7  Short offline       Completed without error       00%     20263         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


So now I am not so sure that only one disk is the problem. Strangely, though, this part is always the same in the /data/crash/info.x logs:
Dump header from device: /dev/ada3p1
Does this indicate that /dev/ada3 could be a problematic disk?

My main questions:
1. Does any of this seem to indicate that _only_ one disk is failing?
2. Can I do any more investigation to figure out the cause?
3. Is there a possibility that the controller on the main board is failing?
3a. If this is the case, would getting a dedicated controller card be a better solution than a new mainboard? (The machine is from 2013; I guess I'm trying to ride it until the wheels fall off.)
4. Could suddenly bad memory be at fault here? (I am running 16GB of ECC.)
5. Why is a kernel panic an acceptable response to a drive starting to fail? (Rhetorical, lol; not very graceful.)

I am trying to avoid throwing money at parts. Things have been going swimmingly for many years with this machine.

appendix
===

system specs:
FreeNAS-11.2-U7 (Build Date: Nov 19, 2019 0:4)
Intel(R) Pentium(R) CPU G860 @ 3.00GHz (2 cores)
16 GiB

zpool status
Code:
  pool: tank
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 0 days 20:07:13 with 0 errors on Sun Oct 11 20:16:56 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/d15a6389-8ba8-11e9-b5ab-3860778b006f  ONLINE       0     0     0
            gptid/6e2c1e3d-9251-11e9-b4ef-3860778b006f  ONLINE       0     0     0
            gptid/51b4e5cf-99c7-11e9-a5e3-3860778b006f  ONLINE       0     0     0
            gptid/62a87797-74e4-11e8-80b0-3860778b006f  ONLINE       0     0     0


smartctl -a /dev/ada1
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Serial Number:    ZFN16HLM
LU WWN Device Id: 5 000c50 0b013f937
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Oct 14 20:42:30 2020 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (  41)    The self-test routine was interrupted
                    by the host with a hard or soft reset.
Total time to complete Offline 
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 484) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x30a5)    SCT Status supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   064   006    Pre-fail  Always       -       42388043
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       14
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   092   060   045    Pre-fail  Always       -       1771659284
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       11613 (120 16 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       14
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   058   051   040    Old_age   Always       -       42 (Min/Max 37/49)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       391
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       489
194 Temperature_Celsius     0x0022   042   049   000    Old_age   Always       -       42 (0 28 0 0 0)
195 Hardware_ECC_Recovered  0x001a   076   064   000    Old_age   Always       -       42388043
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       11583h+30m+13.306s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       48053204564
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       53244477691

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%     11612         -
# 2  Conveyance offline  Completed without error       00%     11554         -
# 3  Short offline       Completed without error       00%     11553         -
# 4  Short offline       Completed without error       00%     11547         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


smartctl -a /dev/ada2
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Serial Number:    ZFN16GZ3
LU WWN Device Id: 5 000c50 0b0136815
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 14 20:42:35 2020 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (  41)    The self-test routine was interrupted
                    by the host with a hard or soft reset.
Total time to complete Offline 
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 491) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x30a5)    SCT Status supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   064   006    Pre-fail  Always       -       42390765
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   092   060   045    Pre-fail  Always       -       1678812342
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       11385 (251 39 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   051   040    Old_age   Always       -       40 (Min/Max 38/49)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       374
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       481
194 Temperature_Celsius     0x0022   040   049   000    Old_age   Always       -       40 (0 29 0 0 0)
195 Hardware_ECC_Recovered  0x001a   076   064   000    Old_age   Always       -       42390765
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       11342h+59m+23.113s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       47796040612
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       49730244947

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%     11384         -
# 2  Conveyance offline  Completed without error       00%     11326         -
# 3  Short offline       Completed without error       00%     11326         -
# 4  Short offline       Completed without error       00%     11319         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


smartctl -a /dev/ada3
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST4000DM004-2CV104
Serial Number:    ZFN166VD
LU WWN Device Id: 5 000c50 0b01530d0
Firmware Version: 0001
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 14 20:42:42 2020 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (  41)    The self-test routine was interrupted
                    by the host with a hard or soft reset.
Total time to complete Offline 
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 487) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x30a5)    SCT Status supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   064   006    Pre-fail  Always       -       42839939
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       17
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   092   060   045    Pre-fail  Always       -       1713849547
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       11815 (187 221 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       17
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       0 0 1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   050   040    Old_age   Always       -       40 (Min/Max 37/48)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       408
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       503
194 Temperature_Celsius     0x0022   040   050   000    Old_age   Always       -       40 (0 29 0 0 0)
195 Hardware_ECC_Recovered  0x001a   076   064   000    Old_age   Always       -       42839939
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       11786h+21m+35.577s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       48285408849
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       56770760577

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%     11814         -
# 2  Conveyance offline  Completed without error       00%     11756         -
# 3  Short offline       Completed without error       00%     11756         -
# 4  Short offline       Completed without error       00%     11749         -
# 5  Extended offline    Interrupted (host reset)      80%     11718         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
From what you have posted, ada0 is the only disk showing errors (UDMA_CRC) which point to cabling or the SATA controller. That disk also has almost double the start/stops of the others... is there a power problem there?
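
To make that comparison concrete, here is a minimal Python sketch that parses a few attribute rows from the four smartctl reports posted above and lists which disks have non-zero raw values. The rows are copied from the thread; the helper names (`parse_rows`, `nonzero_raw`, `ATTR_ROWS`) are invented for this example, and the three sub-fields of Command_Timeout are vendor specific, so the sketch only reports that one of them is non-zero without interpreting it.

```python
# Selected attribute rows copied from the four smartctl -a reports in the thread.
ATTR_ROWS = {
    "ada0": """\
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       34
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
199 UDMA_CRC_Error_Count    0x003e   200   191   000    Old_age   Always       -       23
""",
    "ada1": """\
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       14
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
""",
    "ada2": """\
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
""",
    "ada3": """\
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       17
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       0 0 1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
""",
}


def parse_rows(text):
    """Map attribute name -> raw value string for smartctl attribute rows."""
    attrs = {}
    for line in text.splitlines():
        # Ten columns; the last (RAW_VALUE) may itself contain spaces.
        fields = line.split(None, 9)
        if len(fields) == 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9].strip()
    return attrs


def nonzero_raw(rows_by_disk, attribute):
    """Return {disk: raw} for disks whose raw value contains a non-zero number.

    Only meant for attributes whose raw value is plain integers.
    """
    hits = {}
    for disk, text in rows_by_disk.items():
        raw = parse_rows(text).get(attribute, "")
        if any(int(token) != 0 for token in raw.split()):
            hits[disk] = raw
    return hits


crc_errors = nonzero_raw(ATTR_ROWS, "UDMA_CRC_Error_Count")
timeouts = nonzero_raw(ATTR_ROWS, "Command_Timeout")
start_stops = {
    disk: int(parse_rows(text)["Start_Stop_Count"])
    for disk, text in ATTR_ROWS.items()
}

print("UDMA_CRC_Error_Count:", crc_errors)
print("Command_Timeout:", timeouts)
print("Start_Stop_Count:", start_stops)
```

On the posted data this singles out ada0 for CRC errors and start/stop count, and it also shows a non-zero Command_Timeout field on ada3.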

All of your extended tests (you should be running smartctl -t long /dev/xxxx) either haven't finished or were interrupted. You need to let them run to completion before checking back with smartctl -a; your drives report a recommended polling time of roughly 484 to 491 minutes for the extended test.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
OP can't complete any SMART tests because the machine is rebooting before it can complete.

But here's the real problem:

ST4000DM004

Those are all SMR or "shingled" drives. I imagine it's been working fine and you've just recently hit the point of your drives having to start a reshingle (Read-modify-write) operation for each new update/transaction group.

Pathologically, SMR drives are bad for ZFS, because the copy-on-write nature of the latter (and the need to write into empty space) means it aggravates the shingling behavior.

Have you been throwing a lot of writes at this recently?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Those are all SMR or "shingled" drives
Good catch. I don't know Seagate drives very well and didn't even think to check that (all the SMR noise has been around WD Reds).
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
From what you have posted, ada0 is the only disk showing errors (UDMA_CRC) which point to cabling or the SATA controller. That disk also has almost double the start/stops of the others... is there a power problem there?
Anything is possible. The PSU is 7 years old now, so it could be dying.

Have you been throwing a lot of writes at this recently?
Nothing out of the usual. Most of this pool is long term archival storage for media or documents.
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
[Attached images: 1602773482927.png, 1602773505761.png]

Nothing far out of the usual.
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
Coming up on 44 hours for a long self-test on /dev/ada0, with 70% remaining. Should long self-tests on a 4TB drive really take this long?
 

c77dk

Patron
Joined
Nov 27, 2019
Messages
468
Coming up on 44 hours for a long self-test on /dev/ada0, with 70% remaining. Should long self-tests on a 4TB drive really take this long?
It doesn't sound right to me. My 6 and 8TB drives are way faster than that, so unless you have heavy I/O at the same time, I would be worried.
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
It doesn't sound right to me. My 6 and 8TB drives are way faster than that, so unless you have heavy I/O at the same time, I would be worried.
What else could be the cause? I'm at 97 hours of uptime with 30% remaining on /dev/ada0. I could use some pointers on what to look into next as a possible culprit.
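
For scale, here is a rough Python sketch of how far off the drive's own estimate this is. It assumes progress is linear, that the 97 hours of uptime is close to the time the test has been running, and it uses the 487-minute "recommended polling time" that ada0 reports for the extended test. smartctl only reports remaining progress in 10% steps, so the projections are coarse. The function name `projected_total_hours` is made up for this example.

```python
NOMINAL_MINUTES = 487  # ada0: "Extended self-test routine recommended polling time"


def projected_total_hours(elapsed_hours, percent_remaining):
    """Linear projection of the full test duration from the progress so far."""
    percent_done = 100 - percent_remaining
    if percent_done <= 0:
        raise ValueError("no progress yet, cannot project a duration")
    return elapsed_hours * 100 / percent_done


nominal_hours = NOMINAL_MINUTES / 60

# (elapsed hours, percent remaining) as reported in the two posts.
for elapsed, remaining in [(44, 70), (97, 30)]:
    total = projected_total_hours(elapsed, remaining)
    print(
        f"{elapsed} h elapsed, {remaining}% remaining: "
        f"about {total:.0f} h total, "
        f"roughly {total / nominal_hours:.0f}x the nominal {nominal_hours:.1f} h"
    )
```

Both data points project to roughly 140 to 150 hours for a test the drive itself estimates at about 8 hours, which suggests the test is competing with something else for the disk.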
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
What does the output of gstat -dp look like? Are your drives furiously clicking away when they should otherwise be idle?

Shingled drives slow down terrifically when they've reached the point of "no empty/clean sectors" and having to overwrite previous data - even ZFS's regular transaction/metadata updates can cause them to have to do the equivalent of a read-modify-write on a 256MiB chunk of data.
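
One way to read gstat output for this symptom is to look for samples where a disk is very busy while moving very little data. The Python sketch below does that for rows in the column layout that gstat -dp prints; the sample rows are copied from the output posted later in this thread. The thresholds (30% busy, 2000 kBps) are arbitrary values chosen for illustration, and a few such samples are a hint rather than proof of shingled-write stalls.

```python
# Rows in gstat -dp layout:
# L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
SAMPLE_ROWS = """\
    0      2      0      0    0.0      1     63    0.5      0      0    0.0   33.1  ada1
    0     28      0      0    0.0     28    741    0.4      0      0    0.0    1.1  ada1
    0     94      0      0    0.0     92   1127    0.3      0      0    0.0   94.6  ada1
    0    107      0      0    0.0    105   1890    0.7      0      0    0.0   77.2  ada0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
"""


def busy_but_slow(text, min_busy=30.0, max_kbps=2000.0):
    """Return (name, %busy, read+write kBps) for busy, low-throughput samples.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers ("dT: ..." lines and the column-name row) and blanks.
        if len(fields) != 13 or fields[-1] == "Name":
            continue
        read_kbps = float(fields[3])
        write_kbps = float(fields[6])
        busy = float(fields[11])
        throughput = read_kbps + write_kbps
        if busy >= min_busy and throughput < max_kbps:
            flagged.append((fields[12], busy, throughput))
    return flagged


flagged = busy_but_slow(SAMPLE_ROWS)
for name, busy, kbps in flagged:
    print(f"{name}: {busy}% busy at only {kbps:.0f} kBps")
```

A healthy conventional disk would normally need far more than a megabyte or two per second of writes to approach 95% busy.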
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
Shingled drives slow down terrifically when they've reached the point of "no empty/clean sectors" and having to overwrite previous data - even ZFS's regular transaction/metadata updates can cause them to have to do the equivalent of a read-modify-write on a 256MiB chunk of data.
These drives aren't full; the array is only at 49% capacity

Here's 10 seconds of gstat -dp

Code:
dT: 0.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ada0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ada1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ada2
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.010s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0      2      0      0    0.0      1     63    0.5      0      0    0.0    6.4  ada0
    0      2      0      0    0.0      1     63    0.5      0      0    0.0   33.1  ada1
    0      2      0      0    0.0      1     63    0.6      0      0    0.0    2.3  ada2
    0      2      0      0    0.0      1     63    0.6      0      0    0.0    6.0  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.009s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ada0
    0     28      0      0    0.0     28    741    0.4      0      0    0.0    1.1  ada1
    0     28      0      0    0.0     28    741    0.5      0      0    0.0    1.3  ada2
    0     25      0      0    0.0     25    745    0.5      0      0    0.0    1.3  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.008s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0    107      0      0    0.0    105   1890    0.7      0      0    0.0   77.2  ada0
    0     94      0      0    0.0     92   1127    0.3      0      0    0.0   94.6  ada1
    0     89      0      0    0.0     87   1116    0.4      0      0    0.0   63.8  ada2
    0     90      0      0    0.0     88   1116    0.4      0      0    0.0   85.5  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0      2      0      0    0.0      1     64    0.6      0      0    0.0    6.1  ada0
    0      2      0      0    0.0      1     64    0.6      0      0    0.0    1.0  ada1
    0      2      0      0    0.0      1     64    0.6      0      0    0.0    1.3  ada2
    1      1      0      0    0.0      1     64    0.6      0      0    0.0    0.1  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    7     12      2    128   75.4      8    511    0.9      0      0    0.0   23.9  ada0
    0     17      0      0    0.0     14    894    1.3      0      0    0.0   12.7  ada1
    0     19      2    128   80.4     14    894  191.2      0      0    0.0   71.7  ada2
    0     18      0      0    0.0     14    894    1.2      0      0    0.0   56.5  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.017s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0     16      0      0    0.0      8    504  658.6      0      0    0.0   97.4  ada0
    1      8      0      0    0.0      2    126  151.6      0      0    0.0   38.7  ada1
    1      8      0      0    0.0      2    126  126.6      0      0    0.0   33.7  ada2
    0      9      0      0    0.0      2    126  198.1      0      0    0.0   54.3  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.011s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    1      3      0      0    0.0      3    190   13.1      0      0    0.0    3.8  ada0
    0      5      0      0    0.0      3    190    0.6      0      0    0.0   10.4  ada1
    0      5      0      0    0.0      3    190    0.7      0      0    0.0   10.7  ada2
    0      4      0      0    0.0      3    190   11.7      0      0    0.0    8.5  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.007s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    3     66      0      0    0.0     63   2947    2.6      0      0    0.0  113.1  ada0
    3     63      0      0    0.0     61   2927    2.0      0      0    0.0    9.6  ada1
    3     59      0      0    0.0     57   2935    2.4      0      0    0.0   10.7  ada2
    3     15      0      0    0.0     13    616  251.3      0      0    0.0   91.0  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
dT: 1.004s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    2    101      0      0    0.0     99   1728    0.8      0      0    0.0    8.1  ada0
    0     94      0      0    0.0     90   1764    7.1      0      0    0.0   80.3  ada1
    0     91      0      0    0.0     87   1768    7.2      0      0    0.0   71.3  ada2
    2    127      0      0    0.0    125   4134    1.1      0      0    0.0   13.2  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
These drives aren't full; the array is only at 49% capacity
It's not a matter of logical used capacity - it's a matter of "does the physical section of the disk contain non-zero bits" - the DM-SMR (drive-managed SMR) is adding another layer of indirection to your writes, and ZFS has no visibility into it.

In your gstat snippet you can see the interval here:

Code:
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0     16      0      0    0.0      8    504  658.6      0      0    0.0   97.4  ada0
    1      8      0      0    0.0      2    126  151.6      0      0    0.0   38.7  ada1
    1      8      0      0    0.0      2    126  126.6      0      0    0.0   33.7  ada2
    0      9      0      0    0.0      2    126  198.1      0      0    0.0   54.3  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0

Your drives are taking literally hundreds of milliseconds to write.
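If you want to pick out intervals like this without eyeballing every sample, here is a rough Python sketch that scans gstat -dp text and flags devices whose ms/w is above a threshold. The 100 ms cutoff, the function name, and the renamed kBps columns are my own assumptions for illustration, not anything gstat defines.

```python
# Column order assumed from the `gstat -dp` header shown above; the three
# kBps columns are renamed here only so the dict keys are unique.
COLUMNS = ["L(q)", "ops/s", "r/s", "r_kBps", "ms/r",
           "w/s", "w_kBps", "ms/w", "d/s", "d_kBps", "ms/d", "%busy", "Name"]

def slow_writers(gstat_text, threshold_ms=100.0):
    """Return (device, ms/w) pairs whose write latency exceeds threshold_ms."""
    hits = []
    for line in gstat_text.splitlines():
        fields = line.split()
        # Data rows have 13 fields and start with a number; this skips
        # the "dT:" lines and the column header lines.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        row = dict(zip(COLUMNS, fields))
        ms_w = float(row["ms/w"])
        if ms_w > threshold_ms:
            hits.append((row["Name"], ms_w))
    return hits

sample = """\
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0     16      0      0    0.0      8    504  658.6      0      0    0.0   97.4  ada0
    1      8      0      0    0.0      2    126  151.6      0      0    0.0   38.7  ada1
    1      8      0      0    0.0      2    126  126.6      0      0    0.0   33.7  ada2
    0      9      0      0    0.0      2    126  198.1      0      0    0.0   54.3  ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
"""

for name, ms_w in slow_writers(sample):
    print(f"{name}: {ms_w} ms per write")
```

On the interval quoted above this flags all four ada devices and leaves da0 alone. Feeding it a longer capture would show how often the slow intervals come around.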

New writes to clean sections of an SMR drive are fine, just as fast as non-SMR. But overwrites are the problem - the SMR drive has to rewrite a whole zone at a time, typically a 256MiB chunk. So if you need to update 4KiB of data in the middle of it, the drive has to read in 256MiB, twiddle those four thousand bytes, and then write that 256MiB zone back to disk. That's a write amplification of 65,536x, and that's what I believe is happening here.
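The 65,536x figure is just the ratio of the two sizes. A quick Python check, assuming the 256MiB zone size mentioned above (actual zone sizes vary by drive and aren't published for every model):

```python
MIB = 1024 * 1024
KIB = 1024

zone_bytes = 256 * MIB    # assumed SMR zone size, per the "typically 256MiB" figure
update_bytes = 4 * KIB    # the small overwrite in the example

# Bytes physically rewritten per byte logically updated.
amplification = zone_bytes // update_bytes
print(f"{amplification:,}x")
```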

Further to this - SMR drives band-aid this problem by having a traditional-style recording "buffer" or "cache" zone, which is used both for quickly writing new data that arrives in small chunks and for performing the re-shingling operation described above. This zone is typically located at the outer edge of the disk (for the fastest access) and varies in size by device/manufacturer, but it is usually on the order of a couple dozen GiB. When the drive isn't doing anything else, it will try to de-stage that cached area (and perform re-shingling) in order to free up more fully clear SMR zones, and as long as you aren't trying to make new writes it's not that bad.

But if the drive gets caught in the middle of a re-shingle and ZFS asks for a new write, the drive may need to move the heads from whatever it was doing previously, over to the cache zone to write that new data, then over for a read, then try to get back to re-shingling. It gets worse and worse as the cache gets fuller and fuller, until the drive can only handle a handful of operations per second. And in your configuration, you have four drives: four times as many opportunities to catch a drive at a bad time. (Bonus edit: And in a Z2, any one drive that drags butt will stall out the rest of the vdev.)
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
Man, that is truly terrible, and I understand completely now... this likely hadn't manifested until the drives ran out of clean (never-written) zones... considering I expanded an existing array with these disks a bit over a year ago, maybe the problem is finally coming to fruition.

Is there a way to temporarily increase the amount of time that ZFS considers a vdev "hanging"? I expect that the only solution is replacing every drive again, but I'd like to keep the machine up and running, albeit in its not-so-performant state... I'm going to have to resilver the whole array over time after I select new drives.

As your signature makes you the self-proclaimed expert on "don't use SMR drives", have you heard of anyone successfully returning these "not disclosed" SMR drives to the manufacturer for a credit or replacement? I feel a bit like I got baited and switched here.
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
After doing some research, I thought that setting the zpool failmode property would prevent the kernel panic from a vdev hang, but it is already set to continue for that pool.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I'm not happy about the sneaky introduction of SMR into the consumer mainstream without disclosure. Western Digital has gotten the lion's share of the bad press recently for putting it in their WD Red "NAS" drives, but Seagate was the first to do this as you're unfortunately finding out now, with the Barracuda line before rebranding as "Archive" drives.

zpool failmode controls the behavior of ZFS on the pool going UNAVAIL - unfortunately in this case it's a different set of tunables going off.

The flags you want are under the following:

Code:
vfs.zfs.deadman_enabled:      Kernel panic on stalled ZFS I/O (Default 1 for TRUE)
vfs.zfs.deadman_checktime_ms: Period of checks for stalled ZFS I/O in milliseconds (Default 5,000ms)
vfs.zfs.deadman_synctime_ms:  Stalled ZFS I/O expiration time in milliseconds (Default 1,000,000ms)


The easy solution is to set vfs.zfs.deadman_enabled=0 and hope that it just soldiers on through.
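For a sense of scale on those timers, here is a small Python sketch that converts the defaults listed above into seconds and minutes. The default values are the ones quoted in this post; the helper names are made up for illustration, and read_sysctl_int only makes sense when run on the FreeNAS/FreeBSD box itself, so it is defined but not called here.

```python
import subprocess

# Defaults as quoted above; confirm the live values on your own system.
DEFAULTS_MS = {
    "vfs.zfs.deadman_checktime_ms": 5_000,
    "vfs.zfs.deadman_synctime_ms": 1_000_000,
}

def ms_to_minutes(ms):
    """Convert milliseconds to minutes."""
    return ms / 1000 / 60

def read_sysctl_int(name):
    """Read an integer tunable via `sysctl -n` (hypothetical helper, run on the NAS)."""
    out = subprocess.run(["sysctl", "-n", name],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())

for name, ms in DEFAULTS_MS.items():
    print(f"{name}: {ms} ms = {ms / 1000:.0f} s = {ms_to_minutes(ms):.1f} min")
```

So with the default synctime, an I/O has to be outstanding for about 1000 seconds (roughly 16.7 minutes) before it counts as stalled, which gives an idea of how badly these drives are getting stuck.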

have you heard of anyone successfully returning these "not disclosed" SMR drives to the manufacturer for a credit or replacement? I feel a bit like I got baited and switched here

Many users have had success returning their SMR WD Reds by opening a ticket. Given that Seagate tooted their own horn in the past about "SMR isn't good for NAS" you may be able to submit a support ticket for that if these drives are within warranty. Try to get them to give you Barracuda Pro drives (ST4000DM0004) from this list:


Try a line like "these drives were purchased before this list was publicly available, and I would not have knowingly bought the SMR versions if it had been disclosed." More flies with honey than vinegar, and all that jazz.
 

snicker

Dabbler
Joined
Dec 9, 2013
Messages
10
Thanks for all the assistance HB, this all starts to make a lot of sense now, especially considering how long it seemed to take to resilver these drives, now that I think about it...

I have already sent a message to Seagate about replacing the drives; I'll see what they say and request the Pros as a replacement (or a credit). I'll update the thread as information comes in. Unfortunately the one SN I checked just came out of warranty on Aug 22... I'm going to try my best to get them replaced.

20% to go on my long SMART self test, 117 hours in. No panic yet, but I did set the tunable for the next boot.
 