TrueNAS seems to hang as a result of a failing drive

sirforce

Hi, I have TrueNAS-13.0-U2 (e0947d373e) installed with 8 HDDs, 2 SSDs, and mirrored USB boot disks.

Every day, while backups are running, the backup copy reports a failure: it loses its connection to the file share served by TrueNAS. When I then try to reach the web interface, or a running virtual machine via RDP, everything is unresponsive. I can press the computer's power button and the system appears to shut down in an orderly way rather than a hard power-off. I have not yet connected a monitor to see what is on the TrueNAS host's screen when this happens.

The drive with problems appears to be the ST12000VN0008-2YS101 at /dev/ada5, which the notifications/alerts report as having bad sectors. I have now taken this drive offline to see whether that stops the hangs while I wait for a replacement drive to be delivered.

After the shutdown, I can press the power button again; the system boots and everything seems to work until the next day. I will know more tomorrow, once I see whether the system stays up and the backups all complete without errors or hangs.

The same thing happened a few weeks ago when a different drive failed and was later replaced. I am not sure whether my configuration needs a tweak or whether something else is going on.

I am looking for guidance on how to keep the TrueNAS system from hanging when a drive fails.

Here are some details of my system configuration:

Code:
camcontrol devlist

<Samsung SSD 860 EVO 500GB RVT04B6Q>  at scbus0 target 0 lun 0 (ada0,pass0)
<Samsung SSD 860 EVO 500GB RVT04B6Q>  at scbus1 target 0 lun 0 (ada1,pass1)
<ST12000VN0008-2PH103 SC61>        at scbus6 target 0 lun 0 (ada2,pass2)
<ST12000VN0008-2PH103 SC61>        at scbus7 target 0 lun 0 (ada3,pass3)
<ST12000VN0008-2YS101 SC60>        at scbus8 target 0 lun 0 (ada4,pass4)
<ST12000VN0008-2YS101 SC60>        at scbus9 target 0 lun 0 (ada5,pass5)
<ST12000VN0008-2PH103 SC61>        at scbus9 target 1 lun 0 (ada6,pass6)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 2 lun 0 (ada7,pass7)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 3 lun 0 (ada8,pass8)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 4 lun 0 (ada9,pass9)
<Port Multiplier 5755197b 000e>    at scbus9 target 15 lun 0 (pass10,pmp0)
<USB SanDisk 3.2Gen1 1.00>         at scbus10 target 0 lun 0 (da0,pass11)
<USB SanDisk 3.2Gen1 1.00>         at scbus11 target 0 lun 0 (da1,pass12)

----------------------------
zpool list -v

NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
boot-pool                                         14G  3.87G  10.1G        -         -     1%    27%  1.00x    ONLINE  -
  mirror-0                                        14G  3.87G  10.1G        -         -     1%  27.6%      -    ONLINE
    da0p2                                           -      -      -        -         -      -      -      -    ONLINE
    da1p2                                           -      -      -        -         -      -      -      -    ONLINE
cavern                                          87.3T  20.2T  67.0T        -         -     0%    23%  1.00x  DEGRADED  /mnt
  raidz3-0                                      87.3T  20.2T  67.0T        -         -     0%  23.2%      -  DEGRADED
    gptid/fffc6d7c-0954-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/0083bf01-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/0831b57f-3ae9-11ed-9644-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/00c93cc1-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -   OFFLINE
    gptid/00d68d23-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/00b3ad68-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/00b533b9-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
    gptid/00b45f3e-0955-11ed-af8a-50ebf6cf6deb      -      -      -        -         -      -      -      -    ONLINE
poolVirtual                                      460G  61.8G   398G        -         -    13%    13%  1.00x    ONLINE  /mnt
  mirror-0                                       460G  61.8G   398G        -         -    13%  13.4%      -    ONLINE
    gptid/62e8f1db-d2a0-11ea-a2d6-d05099d333a0      -      -      -        -         -      -      -      -    ONLINE
    gptid/62ec6911-d2a0-11ea-a2d6-d05099d333a0      -      -      -        -         -      -      -      -    ONLINE

----------------------------
lspci

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 7
01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ee
01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43eb
01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43e9
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
08:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)
09:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev d8)
0a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
0a:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
0a:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
0a:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
0a:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller
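
To make the device list easier to read, here is a minimal Python sketch that groups the `camcontrol devlist` entries by bus number. The sample text is copied from the output above; the function names `group_by_bus` and `shared_buses` are hypothetical helpers made up for this illustration, not part of any TrueNAS or FreeBSD tool. On this listing it reports that ada5 through ada9 all share scbus9 together with the port multiplier entry. Whether that layout has anything to do with the hangs is not established here; the sketch only summarizes the topology.

```python
import re
from collections import defaultdict

# Sample copied verbatim from the `camcontrol devlist` output in this post.
SAMPLE = """\
<Samsung SSD 860 EVO 500GB RVT04B6Q>  at scbus0 target 0 lun 0 (ada0,pass0)
<Samsung SSD 860 EVO 500GB RVT04B6Q>  at scbus1 target 0 lun 0 (ada1,pass1)
<ST12000VN0008-2PH103 SC61>        at scbus6 target 0 lun 0 (ada2,pass2)
<ST12000VN0008-2PH103 SC61>        at scbus7 target 0 lun 0 (ada3,pass3)
<ST12000VN0008-2YS101 SC60>        at scbus8 target 0 lun 0 (ada4,pass4)
<ST12000VN0008-2YS101 SC60>        at scbus9 target 0 lun 0 (ada5,pass5)
<ST12000VN0008-2PH103 SC61>        at scbus9 target 1 lun 0 (ada6,pass6)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 2 lun 0 (ada7,pass7)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 3 lun 0 (ada8,pass8)
<WDC WD121KFBX-68EF5N0 83.00A83>   at scbus9 target 4 lun 0 (ada9,pass9)
<Port Multiplier 5755197b 000e>    at scbus9 target 15 lun 0 (pass10,pmp0)
<USB SanDisk 3.2Gen1 1.00>         at scbus10 target 0 lun 0 (da0,pass11)
<USB SanDisk 3.2Gen1 1.00>         at scbus11 target 0 lun 0 (da1,pass12)
"""

# One devlist line: <model> at scbusN target T lun L (name,name)
LINE_RE = re.compile(
    r"^<(?P<model>[^>]*)>\s+at scbus(?P<bus>\d+) "
    r"target (?P<target>\d+) lun (?P<lun>\d+) \((?P<names>[^)]*)\)$"
)


def group_by_bus(text):
    """Map each scbus number to the device names attached to it."""
    buses = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        names = match.group("names").split(",")
        # Prefer the peripheral name (ada5, da0, pmp0) over the passN alias.
        device = next((n for n in names if not n.startswith("pass")), names[0])
        buses[int(match.group("bus"))].append(device)
    return dict(buses)


def shared_buses(buses):
    """Return only the buses that carry more than one device."""
    return {bus: devs for bus, devs in buses.items() if len(devs) > 1}


if __name__ == "__main__":
    for bus, devices in sorted(shared_buses(group_by_bus(SAMPLE)).items()):
        print(f"scbus{bus}: {', '.join(devices)}")
```

To run it against a live system, the same functions could be fed the text captured from `camcontrol devlist` instead of `SAMPLE`.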
 