Boot issues - Unable to obtain IP/all pools disconnected on every reboot

Killamus

Cadet
Joined
Nov 8, 2021
Messages
2
Hey, I'm new to TrueNAS, but a longtime sysadmin. I'm having a few issues:
1. I'm unable to obtain an IP on startup. I initially configured the system with DHCP on its single NIC. I then tried to switch it over to static, but the web UI doesn't allow me to, and I can't figure out how to set it static via config files. Ever since I tried with the web UI, though, it times out during startup after about 10 minutes (or when I hit Ctrl+C), with no IP and the physical interface (bge0) down. The configured VLANs don't work until I log in via the console and run 'ifconfig bge0 up' and 'dhclient bge0' to get an IP. After that, the network works fine.
2. With each reboot, all non-boot pools are offline. I have to export them and then reconnect them before they work. They also don't show up in the CLI at all, so I need to use the web UI to do this. Not a blocker, but very odd and annoying, as I need to log into the console to get an IP first, then go to my desktop and use the web UI to "fix" the pools.
3. Half of my configurations are lost on reboot. Those include:
3.1. SSH host key
3.2. hostname
3.3. all services are disabled, although configurations are kept (except for the UPS config, which is lost)
3.4. HTTPS is disabled, despite showing as enabled in the UI. I have to change the HTTPS port, save, change it back, and save again in order to access via HTTPS. I'm unable to get the redirect to work no matter what I do.
3.5. pkg repos are lost - I have to disable local repos and re-enable BSD repos
3.6. Everything else is saved, however.

So far I've tried:
Replacing the NIC, including disabling the onboard NIC (same issues)
Rebooting with/without drives installed
Rebooting via WebUI, CLI, and pressing the power button
Changing to static via WebUI (When clicking 'test', it times out. Verifying via the console, all this does is shut down the attached NIC - it never brings it back up with a static IP, although it does come back online after test with a DHCP request)

Hardware info:
  • Motherboard make and model: ASRock Z77 Extreme
  • CPU make and model: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
  • RAM quantity: 24G
  • Hard drives, quantity, model numbers, and RAID configuration, including boot drives
    • descr: ST4000DM004-2CV104 (/mnt/Media, striped)
      descr: HGST HDS724040ALE640 (/mnt/Temp, solo)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD1600BEKT-00PVMT0 (boot)
      descr: TOSHIBA MQ01ABD100 (Failed, need to remove)
      descr: SATA SSD (/mnt/Media, cache)
      descr: SATA SSD (/mnt/Media, log, mirror)
      descr: SATA SSD (/mnt/Media, log, mirror)
      descr: SATA SSD (unused)
  • Hard disk controllers
    • 2x Marvell PCIe-to-SATA controllers, AHCI passthru mode
  • Network cards: NetLink BCM57781 Gigabit Ethernet PCIe (Also tried generic Intel card, same results)
Errors I can see in /var/log/messages:
```
Nov 8 16:25:13 truenas kernel: interface uhid.1 already present in the KLD 'kernel'!
Nov 8 16:25:13 truenas kernel: linker_load_file: /boot/kernel/uhid.ko - unsupported file type
Nov 8 16:25:13 truenas kernel: <118>kldload: an error occurred while loading module uhid.ko. Please check dmesg(8) for more details.
Nov 8 16:25:13 truenas kernel: <118>Autoloading module: wmt.ko
Nov 8 16:25:13 truenas kernel: interface wmt.1 already present in the KLD 'kernel'!
Nov 8 16:25:13 truenas kernel: linker_load_file: /boot/kernel/wmt.ko - unsupported file type
Nov 8 16:25:13 truenas kernel: <118>kldload: an error occurred while loading module wmt.ko. Please check dmesg(8) for more details.
Nov 8 16:25:13 truenas kernel: <118>Starting ums0 moused.
Nov 8 16:25:13 truenas kernel: <118>Starting zfsd.
Nov 8 16:25:13 truenas kernel: <6>lo0: link state changed to UP
Nov 8 16:25:13 truenas kernel: <118>Traceback (most recent call last):
Nov 8 16:25:13 truenas kernel: <118> File "/usr/local/bin/midclt", line 10, in <module>
Nov 8 16:25:13 truenas kernel: <118> sys.exit(main())
Nov 8 16:25:13 truenas kernel: <118> File "/usr/local/lib/python3.9/site-packages/middlewared/client/client.py", line 662, in main
Nov 8 16:25:13 truenas kernel: <118> thread.join(args.timeout)
Nov 8 16:25:13 truenas kernel: <118> File "/usr/local/lib/python3.9/threading.py", line 1057, in join
Nov 8 16:25:13 truenas kernel: <118> self._wait_for_tstate_lock(timeout=max(timeout, 0))
Nov 8 16:25:13 truenas kernel: <118> File "/usr/local/lib/python3.9/threading.py", line 1069, in _wait_for_tstate_lock
Nov 8 16:25:13 truenas kernel: <118> elif lock.acquire(block, timeout):
Nov 8 16:25:13 truenas kernel: <118>KeyboardInterrupt
Nov 8 16:25:13 truenas kernel: <118>Script /usr/local/etc/rc.d/middlewared interrupted
Nov 8 16:25:13 truenas kernel: <118>Loading early kernel modules:
Nov 8 16:25:13 truenas kernel: <118>Syncing disks...
Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?
Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?
Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?
Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?
Nov 8 16:25:13 truenas kernel: <118>Alarm clock
Nov 8 16:25:13 truenas kernel: <118>Starting file system checks:
Nov 8 16:25:13 truenas kernel: <118>Setting hostuuid: 87b3fab0-37a0-11ec-924b-bc5ff479ecbe.
Nov 8 16:25:13 truenas kernel: <118>Setting hostid: 0x4df97d06.
Nov 8 16:25:13 truenas kernel: <118>Mounting local filesystems:.
Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?
```
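To see at a glance which of these errors dominate, the excerpt can be tallied with a few lines of Python. This is only a minimal sketch: it assumes the syslog layout shown above, and the function name `summarize_boot_log` and the category labels are made up for illustration, not part of TrueNAS.

```python
import re
from collections import Counter

# Patterns copied from the messages quoted above.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "middleware_unreachable": re.compile(r"Failed to run middleware call"),
    "kldload_error": re.compile(r"kldload: an error occurred while loading module"),
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    "interrupted_script": re.compile(r"Script \S+ interrupted"),
}


def summarize_boot_log(lines):
    """Count how many log lines match each known error pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # Two lines taken from the excerpt above; on a real system you would
    # pass the lines of /var/log/messages instead.
    excerpt = [
        "Nov 8 16:25:13 truenas kernel: <118>Script /usr/local/etc/rc.d/middlewared interrupted",
        "Nov 8 16:25:13 truenas kernel: <118>Failed to run middleware call. Daemon not running?",
    ]
    for label, count in summarize_boot_log(excerpt).items():
        print(label, count)
```

A high count of "Failed to run middleware call" next to an interrupted middlewared script would suggest the later failures stem from the middleware never starting, rather than from separate faults.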
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
  • Motherboard make and model: ASRock Z77 Extreme
    • That's a gamer board, so it's got loads of crap on it you don't want
  • CPU make and model: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
    • Yeah - OK. Doesn't support ECC
  • RAM quantity: 24G
    • OK - it's more than the minimum. How much you actually need depends on use case. But this is fine for quantity. Not ECC though
  • Hard drives, quantity, model numbers, and RAID configuration, including boot drives
    • descr: ST4000DM004-2CV104 (/mnt/Media, striped)
      descr: HGST HDS724040ALE640 (/mnt/Temp, solo)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD40EFZX-68AWUN0 (/mnt/Media, striped)
      descr: WDC WD1600BEKT-00PVMT0 (boot)
      descr: TOSHIBA MQ01ABD100 (Failed, need to remove)
      descr: SATA SSD (/mnt/Media, cache)
      descr: SATA SSD (/mnt/Media, log, mirror)
      descr: SATA SSD (/mnt/Media, log, mirror)
      descr: SATA SSD (unused)
      • I am not working my way through these. But check very carefully if any are SMR. If they are, then take them outside and burn them (or use them for something else). The Seagate is possibly SMR.
  • Hard disk controllers
    • 2x Marvell PCIe-to-SATA controllers, AHCI passthru mode
      • Some add-on SATA controllers are OK, but almost all are total junk. Use proper HBAs
  • Network cards: NetLink BCM57781 Gigabit Ethernet PCIe (Also tried generic Intel card, same results)
    • No, just no. Use a proper NIC. Intel server card or similar
Now for the error messages and such.
You assign an IP address via the GUI. So use the initial DHCP address to access the GUI. Do NOT use the shell / command line.
You haven't said what version you are running, but assuming CORE, current version, configure the network via the GUI (Network/Interfaces). You will also need to adjust Global Configuration for gateway, DNS, etc.
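The reason Global Configuration matters is that a static setup needs more than an address: without a default gateway and at least one nameserver, the box can be up and still unreachable from other subnets or unable to resolve names. A minimal sketch of that checklist, using only Python's standard `ipaddress` module; the function name `check_static_config` and the example addresses are hypothetical, and this does not talk to TrueNAS at all:

```python
import ipaddress


def check_static_config(address, gateway, nameservers):
    """Return a list of problems with a static setup (empty list: none found)."""
    problems = []
    # Address in CIDR form, e.g. "192.168.1.50/24" (example value only).
    iface = ipaddress.ip_interface(address)
    if not gateway:
        problems.append("no default gateway set")
    else:
        gw = ipaddress.ip_address(gateway)
        if gw not in iface.network:
            problems.append(f"gateway {gw} is outside {iface.network}")
        elif gw == iface.ip:
            problems.append("gateway is the same as the interface address")
    if not nameservers:
        problems.append("no DNS nameservers set")
    return problems


if __name__ == "__main__":
    # Address set on the interface, but nothing filled in for gateway or DNS.
    for problem in check_static_config("192.168.1.50/24", "", []):
        print(problem)
```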

As for the rest, given you have been messing around inside config files, I would start again. Use a completely new build and import your pool/pools.

BTW
descr: SATA SSD (/mnt/Media, log, mirror)
descr: SATA SSD (/mnt/Media, log, mirror)
What are you doing that requires a SLOG?

descr: SATA SSD (/mnt/Media, cache)
This will be hurting performance. It almost certainly doesn't do what you think it does and it's likely way too big for your ARC.
 

Killamus

Cadet
Joined
Nov 8, 2021
Messages
2
You assign an IP address via the GUI. So use the initial DHCP address to access the GUI. Do NOT use the shell / command line.
You haven't said what version you are running, but assuming CORE, current version, configure the network via the GUI (Network/Interfaces). You will also need to adjust Global Configuration for gateway, DNS, etc.
The global configuration is what I was missing with setting the IP, thanks! I wish there was an error somewhere stating that these were unset when using DHCP.
 