Lost GUI - issue with a Broadcom 10gig card

Ashley Drees

Dabbler
Joined
Oct 6, 2015
Messages
20
I have FreeBSD 11.2-STABLE (FreeNAS.amd64) #0 r325575+9a3c7d8b53f(HEAD)
It is on a Dell 2900 with 24 GB of memory, 8 2DB drives on an Intel 63XXESB2 SATA300 controller, and a Broadcom BCM57412 NetXtreme-E 10Gb Ethernet card, plus the usual built-in NIC.

Sometime after I installed the Broadcom BCM57412 NetXtreme-E 10Gb Ethernet card, I started losing the GUI. The first time it went, I blamed myself, thinking I had done something dull on the command line that locked it up, and I rebooted from the command line. It has now happened again, and this time I suspect the card or its driver.

An error in the nginx log shows:

May 7 12:25:29 server.domain.tld nginx: 2019/05/07 12:25:29 [error] 57255#100628: *109 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 192.168.9.22, server: localhost, request: "GET /websocket HTTP/1.1", upstream: "http://127.0.0.1:6000/websocket", host: "192.168.9.13"

To me this looks like the web socket attached to the NIC with IP 192.168.9.13 is broken in some way.

192.168.9.22 is my workstation trying to get to the GUI.
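For anyone reading the log line above, here is a minimal Python sketch that pulls the named fields out of it. The function names (`parse_nginx_fields`, `upstream_is_loopback`) are made up for illustration and are not part of FreeNAS or nginx. In nginx error lines, `upstream` is the backend nginx was proxying to and `host` is the Host header the browser sent, so the sketch simply checks whether the hop that timed out was a loopback one.

```python
import re

# Matches the "key: value" fields nginx appends to an error line.
# Values are either double-quoted or run up to the next comma/space.
FIELD_RE = re.compile(r'(client|server|request|upstream|host): ("[^"]*"|[^,\s]+)')


def parse_nginx_fields(line):
    """Return the client/server/request/upstream/host fields as a dict."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


def upstream_is_loopback(fields):
    """True when the upstream URL points at 127.0.0.1 or localhost."""
    match = re.match(r"https?://([^:/]+)", fields.get("upstream", ""))
    return bool(match) and match.group(1) in ("127.0.0.1", "localhost")


# The log line quoted in this post.
LOG_LINE = (
    'May 7 12:25:29 server.domain.tld nginx: 2019/05/07 12:25:29 [error] '
    '57255#100628: *109 upstream timed out (60: Operation timed out) while '
    'reading response header from upstream, client: 192.168.9.22, '
    'server: localhost, request: "GET /websocket HTTP/1.1", '
    'upstream: "http://127.0.0.1:6000/websocket", host: "192.168.9.13"'
)

fields = parse_nginx_fields(LOG_LINE)
print(fields["upstream"], upstream_is_loopback(fields))
```

On this line the upstream is a loopback address, which suggests the timeout was between nginx and the local backend on port 6000 rather than on the wire. That does not rule out the NIC or driver as the underlying cause if the backend itself is blocked on something network-related.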

I tried restarting django and nginx, but that did not help. When I restarted the whole machine, the GUI came back until netinfo ran.

I think something is going wrong with the 10gig NIC.

The unused port gives:

bnxt1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:0a:f7:ea:17:21
hwaddr 00:0a:f7:ea:17:21
nd6 options=9<PERFORMNUD,IFDISABLED>
media: Ethernet autoselect (Unknown <full-duplex,rxpause,txpause>)
status: no carrier

Nothing is plugged in there. The active port, however:

ifconfig bnxt0
bnxt0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=a500b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6>
ether 00:0a:f7:ea:17:20
hwaddr 00:0a:f7:ea:17:20
inet 192.168.9.13 netmask 0xffffff00 broadcast 192.168.9.255
nd6 options=9<PERFORMNUD,IFDISABLED>

The output hangs at that point. However, our shared resources (AFP/SMB/SSH) are still available over that fibre connection, so I don't want to force a reboot just yet. The console also seems to have frozen.
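The two ifconfig outputs differ in more than link state, and the difference is easier to see as a set comparison. The following is a minimal Python sketch; `bracket_names` is a hypothetical helper written for this post, and the input strings are just the flags and options lines copied from the outputs above.

```python
import re


def bracket_names(ifconfig_text, key):
    """Return the names listed in the first `key=<hex><A,B,...>` field."""
    match = re.search(rf"\b{re.escape(key)}=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


# Flags and options lines copied from the two outputs in this post.
BNXT1 = (
    "bnxt1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,"
    "VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>\n"
)
BNXT0 = (
    "bnxt0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> "
    "metric 0 mtu 1500\n"
    "options=a500b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,"
    "VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6>\n"
)

for key in ("flags", "options"):
    only_bnxt0 = sorted(bracket_names(BNXT0, key) - bracket_names(BNXT1, key))
    only_bnxt1 = sorted(bracket_names(BNXT1, key) - bracket_names(BNXT0, key))
    print(key, "only on bnxt0:", only_bnxt0, "| only on bnxt1:", only_bnxt1)
```

On these outputs, bnxt0 additionally has UP, RUNNING and PROMISC set and lacks TXCSUM, TSO4, TSO6, LRO and TXCSUM_IPV6 compared with bnxt1. Whether that difference has anything to do with the hang is not established here.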

Ah, one other thing: its uplink is a UniFi Switch 16XG, which does not seem to be having any issues.

However, the console shows "bnxt0 TX(2) desc avail = 21. pdix = 172", repeated with various pdix values: 173, 102, 222 and 7.
The console is also locked up.
 

Ashley Drees

Further info:
I rebooted this morning, but by 17:00 the machine had hung again. I noticed:

root 20405 0.0 0.0 6868 2408 - LN 12:22 0:00.01 /sbin/ifconfig bnxt0

And also:

May 8 12:22:30 hostname.domain.tld sudo: netdata : TTY=unknown ; PWD=/etc/local/netdata ; USER=root ; COMMAND=/etc/find_alias_for_smtplib.sh -t

Once I see those ifconfig processes with the full path, I know it is hung, and in the process tree they relate to netdata.

I can still inspect bnxt1, and that comes up fine, but the loopback for the GUI is gone, as is control over bnxt0 with ifconfig, once the netdata ifconfig has run once. I am not sure whether that is a symptom or a cause.
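Spotting the stuck ifconfig by eye can be turned into a small check. This is a minimal Python sketch, assuming input in the `ps aux` column layout seen in the line above (USER, PID, %CPU, %MEM, VSZ, RSS, TT, STAT, STARTED, TIME, COMMAND). `find_stuck_ifconfig` is a hypothetical helper, not something FreeNAS or netdata ships. As far as FreeBSD's ps(1) documents it, a leading `L` in STAT marks a process waiting to acquire a lock, which is the state the stuck process shows.

```python
def find_stuck_ifconfig(ps_text, state_prefix="L"):
    """Return /sbin/ifconfig processes whose STAT starts with state_prefix.

    ps_text is expected to be `ps aux`-style output; the header line and
    any malformed lines are skipped.
    """
    stuck = []
    for line in ps_text.splitlines():
        parts = line.split(None, 10)  # COMMAND keeps its embedded spaces
        if len(parts) < 11 or not parts[1].isdigit():
            continue
        pid, stat, started, command = int(parts[1]), parts[7], parts[8], parts[10]
        if command.startswith("/sbin/ifconfig") and stat.startswith(state_prefix):
            stuck.append(
                {"pid": pid, "stat": stat, "started": started, "command": command}
            )
    return stuck


# The process line quoted in this post, under a typical header line.
PS_SAMPLE = (
    "USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND\n"
    "root 20405 0.0 0.0 6868 2408 - LN 12:22 0:00.01 /sbin/ifconfig bnxt0\n"
)

for proc in find_stuck_ifconfig(PS_SAMPLE):
    print(proc["pid"], proc["stat"], proc["started"], proc["command"])
```

The 12:22 start time of that process lines up with the netdata sudo entry logged at 12:22:30, which fits the observation that the netdata-launched ifconfig is the first thing to get stuck.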

I have an Intel card coming tomorrow. I will replace the Broadcom with it and see how things go.
 