SOLVED On board NIC link state on and off all the time

ULTIoNT

Dabbler
Joined
Jul 14, 2023
Messages
24
Hi there. I have set up an OpenWrt VM in TrueNAS SCALE, and so far the internet connection has dropped every morning. At first I thought something was wrong with the VM, so I rebooted it, but that didn't help at all; I have to restart TrueNAS itself to bring the internet back up. When I checked the log, I found the NIC link flapping between up and down all the time. Below is just an excerpt, and no error is reported
Code:
Aug 13 10:58:50 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:04 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:17 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:32 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:47 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:00:01 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:00:16 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:00:30 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:00:45 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:00:59 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:01:14 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:01:28 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:01:42 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:01:57 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:02:12 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:02:27 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:02:41 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:02:56 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:03:10 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:03:24 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:03:38 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:03:53 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:04:07 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:04:22 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:04:31 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 11:04:40 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
This happens even after I shut down the VM that uses the NIC and reset its configuration. One of the major reasons I built this new NAS was to put the router, the files and all my services in one place, and having the router VM underperform even my 6-year-old Netgear router is driving me crazy.
Honestly, what is happening? Right now I am, ridiculously, relying on Wi-Fi as the fail-over connection in my OpenWrt VM. Wi-Fi has been frowned upon here many times because it is considered far less stable than a cabled connection, yet it is the cabled connection that fails all the time.
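
To put a number on how often the link is being renegotiated, the gaps between consecutive "Link is Up" messages can be computed from the log. This is only a rough sketch: `link_gaps` is a made-up helper name, and it assumes the syslog layout shown above with all timestamps on the same day.

```shell
#!/bin/sh
# link_gaps: read kernel log lines on stdin and print the number of
# seconds between consecutive "NIC Link is Up" messages.
# Assumes field 3 is HH:MM:SS and that no line crosses midnight.
link_gaps() {
    awk '/NIC Link is Up/ {
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) print now - prev
        prev = now
        seen = 1
    }'
}

# Example using the first three lines of the excerpt above.
link_gaps <<'EOF'
Aug 13 10:58:50 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:04 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 13 10:59:17 TrueNAS kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
EOF
```

For those three lines it prints 14 and 13, so the link is coming back up roughly every 14 seconds. On the host the function could probably be fed straight from the kernel log with `journalctl -k | link_gaps`, assuming the journal prints timestamps in the same layout.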
The NIC involved is a
Code:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (17) I219-V (rev 11)
        Subsystem: ASRock Incorporation Ethernet Connection (17) I219-V
        Flags: bus master, fast devsel, latency 0, IRQ 147, IOMMU group 12
        Memory at 75400000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

which is just a very common onboard 1 Gbps NIC. Removing this PCI device and rescanning doesn't help either. I read somewhere (on the PVE forums, although I installed TrueNAS on bare metal and use it as the hypervisor) that there may be a bug with offloading on this NIC, but turning it off made no difference. To be optimistic, maybe it only works if offloading is turned off before things go wrong. I have set up a post-init script as follows and will report back if the problem persists.
Code:
ethtool -K enp0s31f6 tso off gso off
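
To confirm that the post-init script actually took effect after a reboot, the output of `ethtool -k` (lowercase k, which lists the current feature state) can be checked. A minimal sketch: `offloads_off` is a made-up helper name, and it assumes the usual `feature-name: on|off` lines that ethtool prints.

```shell
#!/bin/sh
# offloads_off: read `ethtool -k` output on stdin and succeed (exit 0)
# only if both TSO and GSO are reported as off.
offloads_off() {
    awk -F': ' '
        $1 ~ /^[ \t]*tcp-segmentation-offload$/     { tso = $2 }
        $1 ~ /^[ \t]*generic-segmentation-offload$/ { gso = $2 }
        END { exit !(tso ~ /^off/ && gso ~ /^off/) }
    '
}

# Only query the real interface when ethtool is installed.
if command -v ethtool >/dev/null 2>&1; then
    if ethtool -k enp0s31f6 2>/dev/null | offloads_off; then
        echo "TSO and GSO are off on enp0s31f6"
    else
        echo "TSO/GSO still enabled (or enp0s31f6 not found)"
    fi
fi
```

The function only reads text, so it can also be tried against a saved copy of the `ethtool -k` output.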


Edit:
So I find that the I219-V NIC is not officially supported? OK, I missed that when picking the components. So is the problem I am seeing going to keep happening and never be fixed? What about passing the NIC through to the VM directly as a PCI device instead of binding to it in TrueNAS? Is that possible?
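
On the passthrough idea: as far as I understand, a PCI device can generally only be handed to a VM together with everything else in its IOMMU group, and the lspci output above puts this NIC in group 12. A quick sketch to see what else lives in that group; `list_iommu_group` is a hypothetical helper, and its optional second argument exists only so the function can be pointed at another directory for testing.

```shell
#!/bin/sh
# list_iommu_group: print the PCI addresses that share one IOMMU group.
# $1 = group number, $2 = sysfs root (defaults to /sys/kernel/iommu_groups).
list_iommu_group() {
    group_dir="${2:-/sys/kernel/iommu_groups}/$1/devices"
    if [ ! -d "$group_dir" ]; then
        echo "IOMMU group $1 not found (is the IOMMU enabled?)" >&2
        return 1
    fi
    ls "$group_dir"
}

# Group 12 is taken from the lspci output above.
list_iommu_group 12 || true
```

If 0000:00:1f.6 is the only entry, passthrough is at least plausible. If other 00:1f.x functions show up next to it, passing through the NIC on its own is unlikely to work cleanly.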

Edit 2:
I am also switching the VM NIC from virtio to e1000e to see if things change, as I heard that e1000e trades some performance for compatibility and the I219-V uses the e1000e driver on the host machine anyway. I guess this only helps if the root cause of my problem is how the VM uses the NIC rather than how TrueNAS handles the physical NIC itself.

Edit 3:
So far, since I disabled offloading in a post-init script, I haven't seen any NIC reset in the system log, but I think I need to wait at least several days before drawing a conclusion. In case the hardware setup is of interest:
  • Motherboard: ASRock B660M Pro RS
  • CPU: i3-13100
  • RAM: 4 x 16 GB at 3200
  • Drives: Silicon Power 1 TB SP001TBP34A60M28 for boot; Kingston KC3000 1 TB stripe; 4 x Seagate Exos X14 14 TB in RAIDZ1
  • Hard disk controllers: N/A
  • Network cards: onboard I219-V; Intel X540-AT2
Edit 4:
TrueNAS has not reported any interface going down so far, but the OpenWrt VM did see a link down just now, which I got past by simply restarting the interface. I am not sure whether this is a problem with my new setup, my modem or the ISP, because I used to run into similar issues with my Netgear router. A bit of a digression, but to automate the interface restart when the link goes down, you can put a script in
Code:
/etc/hotplug.d/net/
in OpenWRT for that.
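
As a rough illustration of the kind of check such a script needs (this is a polling-style sketch, not the hotplug mechanism itself): the Linux kernel exposes link state in `/sys/class/net/<dev>/carrier`. `link_is_down`, `DEV` and `LOGICAL_IF` are made-up names, and `eth1` and `wan` are assumptions about the VM's configuration, not values from my setup.

```shell
#!/bin/sh
# link_is_down: succeed (exit 0) when the device reports no carrier.
# $1 = device name, $2 = sysfs root (defaults to /sys/class/net).
# A missing or unreadable carrier file is treated as "down".
link_is_down() {
    carrier_file="${2:-/sys/class/net}/$1/carrier"
    [ "$(cat "$carrier_file" 2>/dev/null)" != "1" ]
}

# Assumed names: adjust to the WAN device and the logical interface
# defined in /etc/config/network.
DEV="${DEV:-eth1}"
LOGICAL_IF="${LOGICAL_IF:-wan}"

if link_is_down "$DEV"; then
    echo "no carrier on $DEV, would restart $LOGICAL_IF"
    # ifup "$LOGICAL_IF"   # uncomment on OpenWrt to actually restart it
fi
```

The restart line is left commented out so the sketch is harmless to run as-is; something like cron could call it periodically.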

Edit 5:
Despite many other problems with TrueNAS SCALE (I just upgraded to the Cobia beta) that make me reboot the server from time to time, the OpenWrt VM is now working just fine.

Edit 6:
Another week with no networking problems has passed. I am marking this as solved.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
you are using unsupported hardware. it's unsupported because it does weird things like this. YMMV
 

ULTIoNT

Dabbler
Joined
Jul 14, 2023
Messages
24
I am aware of that, and I had already mentioned it in an update to the original post before your reply. I am still hoping that someone who has used similar hardware will share a workaround if one exists, and if I get this to work, I would like to share my solution here as well so that people in the same boat have something to refer to. To be honest, this is a very common NIC.

Speaking of hardware compatibility, I actually find conflicting information in this forum. Some said it was supported in 2022 and some said it was still not supported in 2023. I just went with the more recent one, but I don't know where to find a credible list of compatible devices. I looked up the SCALE Hardware Guide, and its Ethernet networking section says Intel interfaces are among the best-supported options, without listing any exceptions. I also searched for information on the motherboard and still didn't find anything relevant to my NIC.

Your reply is also not helpful in that it doesn't seem to have been written after reading the options I have already considered. For example, if this NIC is indeed not supported, what if I pass it through to the VM as a PCI device? Will that make a difference?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I didn't just mean the NIC. the motherboard is a poor choice, the RAM is a poor choice. your whole setup is an unsupported setup and results will vary.
the NICs on supported motherboards are supported.

as this is not a server motherboard, anything you pass through has a pretty good chance of giving mixed results that you will spend large amounts of time trying to troubleshoot. the members of this forum generally avoid this by getting supported hardware instead. help here generally comes from a few volunteers who don't bother to answer questions when they see the hardware is unsupported, as they consider that a complete waste of the free time they are donating.

there are other NAS systems that cater far better to random hardware configurations, like OMV and unraid.

TrueNAS is an appliance built for supported server hardware and can be a frustrating experience on anything else.

one way you can work around it is to buy a supported PCIe NIC and use that. the intel recommendations are usually for NICs being added in, so that people are buying PCIe cards that will work with the OS.

you are, of course, free to do as you like, but it's largely uncharted territory.
 

ULTIoNT

Dabbler
Joined
Jul 14, 2023
Messages
24
I can understand that active members here like you are fed up with questions about any hardware that is not what you consider enterprise-level (your signature speaks well for that), but I doubt that making the disguised shift of concept between what's compatible and what's recommended should be a consensus, given that the Hardware Guide I referred to above states in its very first line: "From repurposed systems to highly custom builds, the fundamental freedom of TrueNAS is the ability to run it on almost any x86 computer."

With offloading disabled, the NIC has been running all right. If it continues to function correctly over the next couple of days, I will conclude this post with my workaround, which I found in other open communities, so that others don't have to search that hard or make a post only to be turned down because their system is not "supported". But if it still malfunctions, I will consider your advice and add an Intel I226 NIC to my last PCIe slot and use that instead, after I finish tinkering with the PCI passthrough solution.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
"compatible and what's recommended should be a consensus"
no. the compatibility and recommendations are driven by the developer of the appliance. this is not a standard OS, it is a limited OS developed by an enterprise to provide enterprise services to enterprises. they have then chosen to make this available to the rest of us, with no licensing bull(^(%, and in return, they get a large sample size for QA and bug reporting.

a great many of their software decisions are driven by this, and one of the consequences of this is that hardware that is considered mediocre gets flat out ignored, because why would they pay a developer for it to begin with? there would have to be a business case for that.

the addition of TrueNAS SCALE, using Debian, has already enabled a significantly larger amount of "meh" gear that FreeBSD never bothered to support because it's a pain with limited actual benefit, and noticeably increased the number of people coming to the forums to troubleshoot their unfortunately planned "Server" based on the Linus Tech Tips ('tech', not 'server' :wink:) video where they throw some random crap together and stuff TrueNAS on it, with literally no intention of ever using it (they pull builds apart often)

if you're clearly just doing an experiment, I wouldn't care, but as soon as you start talking about anything holding data I get....nervous.

my current job is as a backup admin at a company who just had a significant cyber incident where we are restoring nearly the entire company, soooooooo.
I *might* be a teensy bit paranoid about data protection....
 

ULTIoNT

Dabbler
Joined
Jul 14, 2023
Messages
24
Well, after disabling offloading, everything seems to be working perfectly.

I would say that if I had an unrestricted budget and ready access to server hardware, and if it fit in the space I have reserved and met my aesthetic and noise-level requirements, I would definitely go for it. But the reality is people have to do tradeoffs. It is always safest to keep telling people to do the most correct thing when they turn to you for help, but sometimes that is also the least helpful.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
"But the reality is people have to do tradeoffs."
yes. and one of those tradeoffs is using a NAS OS that better handles random gear. that is not truenas, which is literally designed and tested for enterprise hardware and tends to give less desirable results when you go outside its design.

like how you can't install drivers or software into truenas. the reality is that it's just not designed for that.
 