[SOLVED] Task seems blocked: (performance down to near 0%)

dakarx

Cadet
Joined
Feb 11, 2019
Messages
5
My Prod server started acting poorly yesterday (newly built 6 months ago), the GUI would hang for minutes at a time, NFS mounts would stall, command line would hang for 60 seconds or more, these events were continual repeating every 2-5 minutes or so. Leaving the system pretty much unusable. This was running 12.0-u3. (system details should be in signature). No hardware changes have been made in months, last configuration change was weeks ago (added new replication tasks to backup NAS).

middlewared.log
Code:
[2021/06/02 08:11:27] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/disk_/smartctl.py", line 69, in smartctl
    cp = await smartctl(smartctl_args + args, check=False, stderr=subprocess.STDOUT,
  File "/usr/local/lib/python3.9/site-packages/middlewared/common/smart/smartctl.py", line 104, in smartctl
    return await run(["smartctl"] + args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)

[2021/06/02 08:16:22] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/disk_/smartctl.py", line 69, in smartctl
    cp = await smartctl(smartctl_args + args, check=False, stderr=subprocess.STDOUT,
  File "/usr/local/lib/python3.9/site-packages/middlewared/common/smart/smartctl.py", line 104, in smartctl
    return await run(["smartctl"] + args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)

[2021/06/02 08:17:09] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 151, in check
    return await self._produce_alerts_for_ipmitool_output(await ipmitool(["-c", "sel", "elist"]))
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 28, in __call__
    result = await run(["ipmitool"] + args, check=False, encoding="utf8", errors="ignore")
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)

^C[2021/06/02 08:21:25] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/disk_/smartctl.py", line 69, in smartctl
    cp = await smartctl(smartctl_args + args, check=False, stderr=subprocess.STDOUT,
  File "/usr/local/lib/python3.9/site-packages/middlewared/common/smart/smartctl.py", line 104, in smartctl
    return await run(["smartctl"] + args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)

[2021/06/02 08:22:09] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 151, in check
    return await self._produce_alerts_for_ipmitool_output(await ipmitool(["-c", "sel", "elist"]))
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 28, in __call__
    result = await run(["ipmitool"] + args, check=False, encoding="utf8", errors="ignore")
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)

[2021/06/02 08:22:18] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 214, in check
    return self._produce_alert_for_ipmitool_output(await ipmitool(["sel", "info"]))
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/ipmi_sel.py", line 28, in __call__
    result = await run(["ipmitool"] + args, check=False, encoding="utf8", errors="ignore")
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/__init__.py", line 53, in run
    proc = await asyncio.create_subprocess_exec(*args, **kwargs)
  File "/usr/local/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/usr/local/lib/python3.9/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/usr/local/lib/python3.9/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.9/subprocess.py", line 1777, in _execute_child
    part = os.read(errpipe_read, 50000)


This started running version 12.0-u3, hoping it was just a fixed bug, eventually I was able to upgrade to 12.0-u4-1 (which took 2 hours to complete). No change.
Reverted back to 12.0-u3 - no change.
Disabled SMART Service - no change
Removed SMART tasks - no change
Removed the LOG drive from the pool, no change, even physically removed it, no change. (SMART was not collecting data on this drive because of a data format error)

dmsg output;
Code:
---<<BOOT>>---
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.2-RELEASE-p6 df578562304(HEAD) TRUENAS amd64
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
VT(vga): text 80x25
CPU: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz (2399.37-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 51539607552 (49152 MB)
avail memory = 49922826240 (47610 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <HP     ProLiant>
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 hardware threads
random: unblocking device.
Firmware Warning (ACPI): Invalid length for FADT/Pm1aControlBlock: 32, using default 16 (20200430/tbfadt-850)
Firmware Warning (ACPI): Invalid length for FADT/Pm2ControlBlock: 32, using default 8 (20200430/tbfadt-850)
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic0 <Version 2.0> irqs 0-23 on motherboard
Launching APs: 1 14 4 8 5 3 15 6 7 10 2 9 12 13 11
Timecounter "TSC-low" frequency 1199685090 Hz quality 1000
random: entropy device external interface
kbd1 at kbdmux0
mlx5en: Mellanox Ethernet driver 3.5.2 (September 2019)
nexus0
vtvga0: <VT VGA driver> on motherboard
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS> on motherboard
padlock0: No ACE support.
cryptosoft0: <software crypto> on motherboard
acpi0: <HP ProLiant> on motherboard
acpi0: Power Button (fixed)
acpi0: _OSC failed: AE_BUFFER_OVERFLOW
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
apei0: <ACPI Platform Error Interface> on acpi0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x908-0x90b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
ciss0: <HP Smart Array P410i> port 0x4000-0x40ff mem 0xfba00000-0xfbbfffff,0xfb9f0000-0xfb9f0fff irq 28 at device 0.0 on pci1
ciss0: PERFORMANT Transport
pcib2: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 3.0 on pci0
pci3: <ACPI PCI bus> on pcib3
mps0: <Avago Technologies (LSI) SSS6200> port 0x5000-0x50ff mem 0xfbdf0000-0xfbdfffff,0xfbd80000-0xfbdbffff irq 24 at device 0.0 on pci3
mps0: Firmware: 113.00.00.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 984c<ScsiTaskFull,DiagTrace,EEDP,TransRetry,IR,MSIXIndex>
pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci6: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> at device 7.0 on pci0
pci7: <ACPI PCI bus> on pcib7
pcib8: <ACPI PCI-PCI bridge> at device 8.0 on pci0
pci8: <ACPI PCI bus> on pcib8
pcib9: <ACPI PCI-PCI bridge> at device 9.0 on pci0
pci9: <ACPI PCI bus> on pcib9
mps1: <Avago Technologies (LSI) SAS2008> port 0x6000-0x60ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 32 at device 0.0 on pci9
mps1: Firmware: 18.00.00.00, Driver: 21.02.00.00-fbsd
mps1: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
pcib10: <ACPI PCI-PCI bridge> at device 10.0 on pci0
pci10: <ACPI PCI bus> on pcib10
em0: <Intel(R) PRO/1000 Network Connection> port 0x7000-0x701f mem 0xfbfe0000-0xfbffffff,0xfbfc0000-0xfbfdffff irq 33 at device 0.0 on pci10
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using an MSI interrupt
em0: Ethernet address: 00:26:55:e8:44:8c
em1: <Intel(R) PRO/1000 Network Connection> port 0x7020-0x703f mem 0xfbfa0000-0xfbfbffff,0xfbf80000-0xfbf9ffff irq 31 at device 0.1 on pci10
em1: Using 1024 TX descriptors and 1024 RX descriptors
em1: Using an MSI interrupt
em1: Ethernet address: 00:26:55:e8:44:8d
pci0: <base peripheral, interrupt controller> at device 20.0 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 20.1 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 20.2 (no driver attached)
pcib11: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci11: <ACPI PCI bus> on pcib11
pcib12: <ACPI PCI-PCI bridge> at device 0.0 on pci11
pci12: <ACPI PCI bus> on pcib12
bge0: <HP NC326i Dual Port PCI-E Gigabit Adapter, ASIC rev. 0x009003> mem 0xfb8f0000-0xfb8fffff,0xfb8e0000-0xfb8effff irq 16 at device 4.0 on pci12
bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus0: <MII bus> on bge0
brgphy0: <BCM5714 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 9c:8e:99:19:b9:28
bge1: <HP NC326i Dual Port PCI-E Gigabit Adapter, ASIC rev. 0x009003> mem 0xfb8d0000-0xfb8dffff,0xfb8c0000-0xfb8cffff irq 17 at device 4.1 on pci12
bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus1: <MII bus> on bge1
brgphy1: <BCM5714 1000BASE-T media interface> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 9c:8e:99:19:b9:29
uhci0: <Intel 82801JI (ICH10) USB controller USB-A> port 0x1000-0x101f irq 20 at device 29.0 on pci0
usbus0 on uhci0
usbus0: 12Mbps Full Speed USB v1.0
uhci1: <Intel 82801JI (ICH10) USB controller USB-B> port 0x1020-0x103f irq 23 at device 29.1 on pci0
usbus1 on uhci1
usbus1: 12Mbps Full Speed USB v1.0
uhci2: <Intel 82801JI (ICH10) USB controller USB-C> port 0x1040-0x105f irq 22 at device 29.2 on pci0
usbus2 on uhci2
usbus2: 12Mbps Full Speed USB v1.0
uhci3: <Intel 82801JI (ICH10) USB controller USB-F> port 0x1060-0x107f irq 23 at device 29.3 on pci0
usbus3 on uhci3
usbus3: 12Mbps Full Speed USB v1.0
ehci0: <Intel 82801JI (ICH10) USB 2.0 controller USB-A> mem 0xfb5f0000-0xfb5f03ff irq 20 at device 29.7 on pci0
usbus4: EHCI version 1.0
usbus4 on ehci0
usbus4: 480Mbps High Speed USB v2.0
pcib13: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci13: <ACPI PCI bus> on pcib13
vgapci0: <VGA-compatible display> port 0x3000-0x30ff mem 0xf0000000-0xf7ffffff,0xfb7f0000-0xfb7fffff irq 23 at device 3.0 on pci13
vgapci0: Boot video device
uhci4: <HP iLO Standard Virtual USB controller> port 0x3800-0x381f irq 22 at device 4.4 on pci13
usbus5 on uhci4
usbus5: 12Mbps Full Speed USB v1.0
ipmi0: <IPMI System Interface> mem 0xfb6f0000-0xfb6f00ff irq 21 at device 4.6 on pci13
ipmi0: using KSC interface
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH10 SATA300 controller> port 0x10c0-0x10c7,0x10c8-0x10cb,0x10d0-0x10d7,0x10d8-0x10db,0x10e0-0x10ef,0x10f0-0x10ff irq 17 at device 31.5 on pci0
ata2: <ATA channel> at channel 0 on atapci0
ata3: <ATA channel> at channel 1 on atapci0
acpi_tz0: <Thermal Zone> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
acpi_syscontainer0: <System Container> port 0x2e-0x2f,0x620-0x65f,0x680-0x69f,0x600-0x61f,0x660-0x67f,0x300-0x31f on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
ichwd0: <Intel ICH10 watchdog timer> on isa0
ichwd0: ICH WDT present but disabled in BIOS or hardware
device_attach: ichwd0 attach returned 6
ichwd0: <Intel ICH10 watchdog timer> at port 0x930-0x937,0x960-0x97f on isa0
ichwd0: ICH WDT present but disabled in BIOS or hardware
device_attach: ichwd0 attach returned 6
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcf000-0xd2fff pnpid ORM0000 on isa0
uart1: <Non-standard ns8250 class UART with FIFOs> at port 0x2f8 irq 3 on isa0
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est0 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est1 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est2 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est3 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est4 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est5 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est6 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est7 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est8 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est9 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est10 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est11 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est12 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est13 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est14 attach returned 6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 13
device_attach: est15 attach returned 6
Timecounters tick every 1.000 msec
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
ipfw2 (+ipv6) initialized, divert enabled, nat enabled, default to accept, logging disabled
ugen1.1: <Intel UHCI root HUB> at usbus1
ugen5.1: <HP UHCI root HUB> at usbus5
ugen0.1: <Intel UHCI root HUB> at usbus0
ugen2.1: <Intel UHCI root HUB> at usbus2
ugen3.1: <Intel UHCI root HUB> at usbus3
ugen4.1: <Intel EHCI root HUB> at usbus4
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
uhub1: <HP UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
uhub5: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ipmi0: IPMI device rev. 1, firmware rev. 2.33, version 2.0, device support mask 0xf
ipmi0: Number of channels 0
ipmi0: Attached watchdog
Trying to mount root from zfs:boot-pool/ROOT/12.0-U4-1 []...
Root mount waiting for: CAM usbus0 usbus1 usbus2 usbus3 usbus4 usbus5
uhub0: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub3: 2 ports with 2 removable, self powered
ugen5.2: <HP Virtual Keyboard> at usbus5
ukbd0 on uhub1
ukbd0: <Virtual Keyboard> on usbus5
kbd2 at ukbd0
ums0 on uhub1
Root mount waiting for: CAM usbus4 usbus5
ums0: <Virtual Mouse> on usbus5
ums0: 3 buttons and [XY] coordinates ID=0
Root mount waiting for: CAM usbus4
Root mount waiting for: CAM usbus4
uhub4: 8 ports with 8 removable, self powered
Root mount waiting for: CAM
ugen3.2: <Avocent Avocent DSRIQ-USB> at usbus3
ukbd1 on uhub3
ukbd1: <EP1 Interrupt> on usbus3
kbd3 at ukbd1
ums1 on uhub3
ums1: <EP2 Interrupt> on usbus3
ums1: 5 buttons and [XYZ] coordinates ID=1
Root mount waiting for: CAM
Root mount waiting for: CAM
ada0 at ata3 bus 0 scbus5 target 0 lun 0
ada0: <Crucial CT240M500SSD1 MU03> ACS-2 ATA SATA 3.x device
ada0: Serial Number 14100C06B003
ada0: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada0: 228936MB (468862128 512 byte sectors)
ada0: quirks=0x2<NCQ_TRIM_BROKEN>
da0 at mps0 bus 0 scbus2 target 0 lun 0
da0: <LSI NWD-BLP4-800 0002> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number 2622235072208845886
da0: 150.000MB/s transfers
da0: Command Queueing enabled
da0: 762939MB (1562499584 512 byte sectors)
da4 at mps1 bus 0 scbus3 target 9 lun 0
da4: <ATA HGST HUH728060AL T7J0> Fixed Direct Access SPC-4 SCSI device
da4: Serial Number 2RG6J0VX           
da4: 600.000MB/s transfers
da4: Command Queueing enabled
da4: 5723166MB (11721045168 512 byte sectors)
da1 at mps1 bus 0 scbus3 target 6 lun 0
da1: <ATA HUS726060ALE610 TDR2> Fixed Direct Access SPC-4 SCSI device
da1: Serial Number K8K98N4N
da1: 600.000MB/s transfers
da1: Command Queueing enabled
da1: 5723166MB (11721045168 512 byte sectors)
da2 at mps1 bus 0 scbus3 target 7 lun 0
da2: <ATA HGST HUS726060AL TD05> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number K8J1V6HD           
da2: 600.000MB/s transfers
da2: Command Queueing enabled
da2: 5723166MB (11721045168 512 byte sectors)
da3 at mps1 bus 0 scbus3 target 8 lun 0
da3: <ATA HGST HUS726060AL TD05> Fixed Direct Access SPC-4 SCSI device
da3: Serial Number K8J251UD           
da3: 600.000MB/s transfers
da3: Command Queueing enabled
da3: 5723166MB (11721045168 512 byte sectors)
da5 at mps1 bus 0 scbus3 target 11 lun 0
da5: <ATA HGST HUS726060AL TD05> Fixed Direct Access SPC-4 SCSI device
da5: Serial Number K8GKN14D           
da5: 600.000MB/s transfers
da5: Command Queueing enabled
da5: 5723166MB (11721045168 512 byte sectors)
lo0: link state changed to UP
em0: link state changed to UP
em1: link state changed to UP
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
GEOM_MIRROR: Device mirror/swap1 launched (2/2).
GEOM_ELI: Device mirror/swap0.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: hardware
GEOM_ELI: Device mirror/swap1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: hardware
hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI> TSC/1/64/0x20<REA> IAP/4/48/0x3ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA,PRC> IAF/3/48/0x67<INT,USR,SYS,REA,WRI> UCP/8/48/0x3f8<EDG,THR,REA,WRI,INV,QUA,PRC> UCF/1/48/0x60<REA,WRI>
em0: link state changed to DOWN
em1: link state changed to DOWN
bge0: link state changed to DOWN
em0: link state changed to UP
em1: link state changed to UP
lagg0: link state changed to UP
CPU: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz (2399.37-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features3=0x9c000000<IBPB,STIBP,L1DFL,SSBD>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
nfsd: can't register svc name
Security policy loaded: MAC/ntpd (mac_ntpd)
bridge0: Ethernet address: 02:2d:4c:97:d4:00
epair0a: Ethernet address: 02:7d:a5:36:20:0a
epair0b: Ethernet address: 02:7d:a5:36:20:0b
epair0a: link state changed to UP
epair0b: link state changed to UP
epair0a: changing name to 'vnet0.1'
em0: promiscuous mode enabled
em1: promiscuous mode enabled
lagg0: promiscuous mode enabled
em0: link state changed to DOWN
em1: link state changed to DOWN
bridge0: link state changed to UP
lagg0: link state changed to DOWN
vnet0.1: promiscuous mode enabled
em1: link state changed to UP
lo0: link state changed to UP
em0: link state changed to UP
lagg0: link state changed to UP
 

dakarx

Cadet
Joined
Feb 11, 2019
Messages
5
Update: Hoping for a corrupted install/update or maybe a failing SSD. I grabbed a new SSD, performed a clean install of 12.0-u4. I was able to import the configuration over and my ZFS volume mounted right up....

No change! it still acts like it's totally I/O bound.

and throwing the same error;
Code:
[2021/06/02 12:55:02] (WARNING) middlewared._loop_monitor_thread():1418 - Task seems blocked:
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1241, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1198, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 973, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/datastore/read.py", line 164, in query
    result = await self._queryset_serialize(
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/datastore/read.py", line 214, in _queryset_serialize
    result.append(await self._serialize(
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/datastore/read.py", line 232, in _serialize
    data = await self.middleware.call(extend, data)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1241, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1198, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/account.py", line 144, in user_extend
    if os.path.exists(keysfile):
  File "/usr/local/lib/python3.9/genericpath.py", line 19, in exists
    os.stat(path)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Is the pool full?

What's the output of gstat show as being a bottleneck?

etc.
 

dakarx

Cadet
Joined
Feb 11, 2019
Messages
5
Pool is running around 60%
Code:
root@nas1[/var/log]# zpool iostat -v
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
boot-pool                                       1.40G   102G      1      0  16.3K  2.01K
  ada0p2                                        1.40G   102G      1      0  16.3K  2.01K
----------------------------------------------  -----  -----  -----  -----  -----  -----
vol1                                            16.4T  10.8T      8     63  86.6K  1.01M
  raidz1                                        16.4T  10.8T      8     61  86.6K   910K
    gptid/72d98e02-a7bb-11eb-a45a-002655e8448c      -      -      1     13  15.9K   182K
    gptid/73497d95-a7bb-11eb-a45a-002655e8448c      -      -      1     12  16.2K   182K
    gptid/73dcb8c6-a7bb-11eb-a45a-002655e8448c      -      -      1     12  18.4K   182K
    gptid/73fd2b86-a7bb-11eb-a45a-002655e8448c      -      -      1     12  18.1K   182K
    gptid/749d8620-a7bb-11eb-a45a-002655e8448c      -      -      1     12  18.1K   182K
logs                                                -      -      -      -      -      -
  gptid/ef53adef-c340-11eb-b4d7-002655e8448c    1.17M   744G      0      2     65   126K
----------------------------------------------  -----  -----  -----  -----  -----  -----


there is hardly anything in use I've disconnected most clients, this is the highest measurements I could capture

Code:
root@nas1[/var/log]# gstat
dT: 1.011s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| ada0
    0      1      0      0    0.0      0      0    0.0    0.0| da0
    1     46      0      0    0.0     46    499    2.0    9.2| da1
    1     43      0      0    0.0     43    491    2.2    9.4| da2
    0     45      0      0    0.0     44    510    2.0    9.4| da3
    0     50      0      0    0.0     49    534    0.3    2.9| da4
    1     46      0      0    0.0     46    530    2.7   12.4| da5
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p2
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p3
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/3185c1b7-c3c1-11eb-9efb-ac220bdcff63
    0      1      0      0    0.0      0      0    0.0    0.0| da0p1
    0      0      0      0    0.0      0      0    0.0    0.0| da1p1
    1     46      0      0    0.0     46    499    2.0    9.2| da1p2
    0      0      0      0    0.0      0      0    0.0    0.0| da2p1
    1     43      0      0    0.0     43    491    2.2    9.4| da2p2
    0      0      0      0    0.0      0      0    0.0    0.0| da3p1
    0     45      0      0    0.0     44    510    2.0    9.4| da3p2
    0      0      0      0    0.0      0      0    0.0    0.0| da4p1
    0     50      0      0    0.0     49    534    0.3    2.9| da4p2
    0      0      0      0    0.0      0      0    0.0    0.0| da5p1
    1     46      0      0    0.0     46    530    2.7   12.5| da5p2
    0      0      0      0    0.0      0      0    0.0    0.0| mirror/swap1
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/3189f766-c3c1-11eb-9efb-ac220bdcff63
    0      1      0      0    0.0      0      0    0.0    0.0| gptid/ef53adef-c340-11eb-b4d7-002655e8448c
    1     46      0      0    0.0     46    499    2.0    9.2| gptid/73497d95-a7bb-11eb-a45a-002655e8448c
    0      0      0      0    0.0      0      0    0.0    0.0| mirror/swap0.eli
    1     43      0      0    0.0     43    491    2.2    9.4| gptid/73dcb8c6-a7bb-11eb-a45a-002655e8448c
    0     45      0      0    0.0     44    510    2.0    9.4| gptid/73fd2b86-a7bb-11eb-a45a-002655e8448c
    0     50      0      0    0.0     49    534    0.3    2.9| gptid/72d98e02-a7bb-11eb-a45a-002655e8448c
    0      0      0      0    0.0      0      0    0.0    0.0| mirror/swap1.eli
    1     46      0      0    0.0     46    530    2.7   12.5| gptid/749d8620-a7bb-11eb-a45a-002655e8448c
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/72ed00e4-a7bb-11eb-a45a-002655e8448c
    0      0      0      0    0.0      0      0    0.0    0.0| mirror/swap0


load average stays under 1
Code:
last pid:  4030;  load averages:  0.79,  0.60,  0.53                                                                                up 0+02:09:35  14:39:49
64 processes:  1 running, 63 sleeping
CPU:  0.0% user,  0.0% nice,  5.2% system,  0.0% interrupt, 94.7% idle
Mem: 1396M Active, 1512M Inact, 6523M Wired, 38G Free
ARC: 2838M Total, 853M MFU, 1192M MRU, 1664K Anon, 17M Header, 774M Other
     1583M Compressed, 2628M Uncompressed, 1.66:1 Ratio
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 1441 root          4  20    0    11M  2408K vlruwk  12   2:55   3.90% nfsd
  460 root         39  20    0   511M   333M kqread   1   3:17   1.41% python3.9
 4030 root          1  20    0   248K   100K vlruwk   7   0:00   0.92% python3.9
 1623 avahi         1  20    0    13M  4188K select   1   0:03   0.13% avahi-daemon
 4028 root          1  20    0    13M  3948K CPU9     9   0:00   0.09% top
 1391 root          1 -52   r0    11M    11M nanslp   1   0:01   0.06% watchdogd
 2298    983       32  29    9  1457M   735M uwait   11   7:18   0.03% syncthing
 1617 root         11  20    0   365M   329M nanslp   9   0:34   0.02% collectd
 2352 root          1  20    0    19M  9172K select   7   0:00   0.01% sshd
 

dakarx

Cadet
Joined
Feb 11, 2019
Messages
5
Updated the firmware on my LSI HBA to the latest IT mode (it was a couple versions behind and in IR mode) - no difference..
NFS still transferring at a paltry < 2M read/write speed.

iperf tests to/from the server run consistently at 980Mbps.

I'm running out of things to throw at it before giving up and switching to my backup server to try and start over
(but that will have to wait until next week when my 10G switch/NICs arrive)
 

dakarx

Cadet
Joined
Feb 11, 2019
Messages
5
This can likely be closed.

The culprit was the SyncThing Jail, anytime the jail is running those "Task seems blocked" errors show up and performance goes in the tank, and disappear when the jail is shut down. I was able to instantly stop the errors in the log as soon as the jail was fully shut down, and they appeared as soon as it was running again. This holds true when the jail is running and the app is 'shutdown' from the web UI. Apparently I wasn't able to fully stop the jail from starting because of the system lag, the UI reported it was off, but it wasn't (I never saw the state change in the log file). As soon as it was fully shut down NFS data transfers were back up to 'reporting' > 120MB/s, saturating the client 1G link.

I don't know if it's a problem with the app, the jail, or perhaps the datasets that I was sync'ing. Leaning towards a combination of things. I was able to replicate the problem on my backup NAS server as well. Plex and other jails are running without issue. Why it decided to wait 4 months or so before going sideways.....who knows.

Spun up the Syncthing app in a VM and mounted the data directory over NFS.

Everything is currently behaving as expected.
 
Top