Can't access web gui or ssh.

xylofrut

Cadet
Joined
Sep 20, 2023
Messages
5
This is a remote system.
The system runs fine, then suddenly, around day 40, I can no longer access the web UI, and ssh hangs with a "traceback middleware" report.
Some services keep running, though: SMB, a jail, and a VM still respond. It took 3 days for the VM to become unresponsive and force us to press the reset button.
This has happened twice.

ssh output:

Code:
FreeBSD 13.1-RELEASE-p7 n245428-4dfb91682c1 TRUENAS

        TrueNAS (c) 2009-2023, iXsystems, Inc.
        All rights reserved.
        TrueNAS code is released under the modified BSD license with some
        files copyrighted by (c) iXsystems, Inc.

        For more information, documentation, help or support, go here:
        http://truenas.com



Traceback (most recent call last):
  File "/usr/local/sbin/hactl", line 171, in <module>
    main(args.command, args.q)
  File "/usr/local/sbin/hactl", line 17, in main
    client = Client()
  File "/usr/local/lib/python3.9/site-packages/middlewared/client/client.py", line 286, in __init__


We pressed the shutdown button, hoping TrueNAS would shut down cleanly, but the system did not shut down.
After resetting and booting successfully, I read the middleware logs
and found this in the period between the crash and the boot:
[2023/09/21 20:51:18] (ERROR) libzfs.query():477 - Failed to retrieve dataset handle for boot-pool: pool I/O is currently suspended

Code:
[2023/09/21 16:51:02] (DEBUG) freenasOS.Configuration.CheckFreeSpace():77 - CheckFreeSpace(path=/tmp/tmpqml3ls61.pem, pool=None, required=1028)
[2023/09/21 16:51:02] (DEBUG) freenasOS.Configuration.TryGetNetworkFile():745 - TryGetNetworkFile(['https://update-master.ixsystems.com/updates/ix_crl.pem']):  Read 1028 bytes total
[2023/09/21 17:51:01] (DEBUG) freenasOS.Configuration.TryGetNetworkFile():606 - TryGetNetworkFile(['https://update-master.ixsystems.com/updates/ix_crl.pem'])
[2023/09/21 17:51:01] (DEBUG) urllib3.connectionpool._new_conn():971 - Starting new HTTPS connection (1): update-master.ixsystems.com:443
[2023/09/21 17:51:02] (DEBUG) urllib3.connectionpool._make_request():452 - https://update-master.ixsystems.com:443 "GET /updates/ix_crl.pem HTTP/1.1" 200 1028
[2023/09/21 17:51:02] (DEBUG) freenasOS.Configuration.CheckFreeSpace():77 - CheckFreeSpace(path=/tmp/tmpbncdy8it.pem, pool=None, required=1028)
[2023/09/21 17:51:02] (DEBUG) freenasOS.Configuration.TryGetNetworkFile():745 - TryGetNetworkFile(['https://update-master.ixsystems.com/updates/ix_crl.pem']):  Read 1028 bytes total
[2023/09/21 17:51:03] (DEBUG) freenasOS.Configuration.TryGetNetworkFile():606 - TryGetNetworkFile(['https://update-master.ixsystems.com/updates/ix_crl.pem'])
[2023/09/21 17:51:03] (DEBUG) urllib3.connectionpool._new_conn():971 - Starting new HTTPS connection (1): update-master.ixsystems.com:443
[2023/09/21 17:51:03] (DEBUG) urllib3.connectionpool._make_request():452 - https://update-master.ixsystems.com:443 "GET /updates/ix_crl.pem HTTP/1.1" 200 1028
[2023/09/21 17:51:03] (DEBUG) freenasOS.Configuration.CheckFreeSpace():77 - CheckFreeSpace(path=/tmp/tmpl0cyfn6p.pem, pool=None, required=1028)
[2023/09/21 17:51:03] (DEBUG) freenasOS.Configuration.TryGetNetworkFile():745 - TryGetNetworkFile(['https://update-master.ixsystems.com/updates/ix_crl.pem']):  Read 1028 bytes total
[2023/09/21 18:07:27] (DEBUG) EnclosureService.sync_zpool():193 - Skipping enclosure slot to zpool sync because no enclosures found
[2023/09/21 18:07:30] (INFO) DiskService.log_disk_info():122 - Found disks: {'ada5': {'name': 'ada5', 'ident': '222306A01A03', 'lunid': '5001b448b0c781bf', 'serial': '222306A01A03'}, 'ada4': {'name': 'ada4', 'ident': 'WD-C82LBEMK', 'lunid': '50014ee21557ff0b', 'serial': 'WD-C82LBEMK'}, 'ada3': {'name': 'ada3', 'ident': 'WD-C82PHSPK', 'lunid': '50014ee2155835a9', 'serial': 'WD-C82PHSPK'}, 'ada2': {'name': 'ada2', 'ident': 'WD-C82PLLDK', 'lunid': '50014ee2c00362bd', 'serial': 'WD-C82PLLDK'}, 'ada1': {'name': 'ada1', 'ident': 'WD-C82PNMZK', 'lunid': '50014ee2155807a4', 'serial': 'WD-C82PNMZK'}, 'ada0': {'name': 'ada0', 'ident': '23022D805122', 'lunid': '5001b448b28f0b02', 'serial': '23022D805122'}}
[2023/09/21 20:51:18] (ERROR) libzfs.query():477 - Failed to retrieve dataset handle for boot-pool: pool I/O is currently suspended
[2023/09/23 16:13:27] (INFO) middlewared.__init__():799 - Starting TrueNAS-13.0-U5.2 middleware
[2023/09/23 19:13:29] (DEBUG) middlewared.setup():1717 - Timezone set to Europe/Athens
[2023/09/23 19:13:31] (DEBUG) middlewared.setup():2949 - Certificate setup for System complete
[2023/09/23 19:13:31] (INFO) middlewared.devd_listen():55 - devd connection established
[2023/09/23 19:13:31] (WARNING) middlewared.send_event():1370 - Event 'failover.status' not registered.
[2023/09/23 19:13:31] (DEBUG) middlewared.__plugins_setup():916 - All plugins loaded
[2023/09/23 19:13:31] (DEBUG) middlewared.__initialize():1630 - Accepting connections
[2023/09/23 19:13:31] (INFO) DiskService.log_disk_info():122 - Found disks: {'ada5': {'name': 'ada5', 'ident': '222306A01A03', 'lunid': '5001b448b0c781bf', 'serial': '222306A01A03'}, 'ada4': {'name': 'ada4', 'ident': 'WD-C82LBEMK', 'lunid': '50014ee21557ff0b', 'serial': 'WD-C82LBEMK'}, 'ada3': {'name': 'ada3', 'ident': 'WD-C82PHSPK', 'lunid': '50014ee2155835a9', 'serial': 'WD-C82PHSPK'}, 'ada2': {'name': 'ada2', 'ident': 'WD-C82PLLDK', 'lunid': '50014ee2c00362bd', 'serial': 'WD-C82PLLDK'}, 'ada1': {'name': 'ada1', 'ident': 'WD-C82PNMZK', 'lunid': '50014ee2155807a4', 'serial': 'WD-C82PNMZK'}, 'ada0': {'name': 'ada0', 'ident': '23022D805122', 'lunid': '5001b448b28f0b02', 'serial': '23022D805122'}, 'nvd0': {'name': 'nvd0', 'ident': 'P300ADBB22111814488', 'lunid': '0000000000000001', 'serial': 'P300ADBB22111814488'}}
[2023/09/23 19:13:32] (DEBUG) EtcService.generate():445 - No new changes for /etc/krb5.conf
[2023/09/23 19:13:32] (DEBUG) EtcService.generate():445 - No new changes for /etc/krb5.conf
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():445 - No new changes for /etc/hosts
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():445 - No new changes for /etc/pam.d/sshd
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():429 - mako:local/users.oath file removed.
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():429 - mako:local/openvpn/server/openvpn_server.conf file removed.
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():429 - mako:local/openvpn/client/openvpn_client.conf file removed.
[2023/09/23 19:13:33] (DEBUG) EtcService.generate():429 - mako:wireguard/wg0.conf file removed.
[2023/09/23 19:13:42] (ERROR) asyncio.default_exception_handler():1753 - Task exception was never retrieved
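
To pull only the interesting entries out of a long middleware log like the one above, a small filter can help. This is a minimal sketch, assuming every entry follows the `[YYYY/MM/DD HH:MM:SS] (LEVEL) source - message` layout seen in the excerpt; the helper names `parse_line` and `filter_entries` are made up for illustration and are not part of TrueNAS.

```python
import re
from datetime import datetime

# Assumed entry layout, taken from the excerpt above:
# [2023/09/21 20:51:18] (ERROR) libzfs.query():477 - Failed to retrieve ...
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] "
    r"\((?P<level>[A-Z]+)\) "
    r"(?P<source>\S+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log entry, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "time": datetime.strptime(match["ts"], "%Y/%m/%d %H:%M:%S"),
        "level": match["level"],
        "source": match["source"],
        "message": match["message"],
    }


def filter_entries(lines, levels=("ERROR", "WARNING"), start=None, end=None):
    """Keep entries with one of the given levels inside [start, end]."""
    kept = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] not in levels:
            continue
        if start is not None and entry["time"] < start:
            continue
        if end is not None and entry["time"] > end:
            continue
        kept.append(entry)
    return kept
```

Passing the lines of the log file (on this system presumably /var/log/middlewared.log, which is an assumption) together with `start` and `end` set around the crash window would leave only the ERROR and WARNING entries, such as the "pool I/O is currently suspended" line. Lines that do not match the layout, for example continuation lines of a traceback, are skipped.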


Those 2 crashes happened under TrueNAS-13.0-U3.*; I have since upgraded to TrueNAS-13.0-U5.3.
The system has:
poolA: raidz1, 4x HDD
poolB: mirror, 2x SSD
boot-pool: M.2 device

Consumer-grade hardware:
ASRock Z590 Phantom
Intel i5-10400F
64GB Kingston non-ECC RAM

I would really appreciate a tip on how to handle this situation:
where to look, what to change, any suggestions.

PS: due to the distance, I'm unable to interact with a local keyboard/screen.
Thank you.
 