DemohFoxfire
Dabbler
- Joined
- May 2, 2023
- Messages
- 11
This build isn't in production yet, but I ran into some pretty weird problems.
Supermicro X9 series board
Intel DC P4608 6.4 TB (2 x 3.2 TB)
LSI 9211-8i, active with drives attached
LSI 9300-8i, in the system so I can firmware update it
Intel XXV710-DA2
Using the 3008 firmware from a guide here, I went to flash the 9300-8i with sas3flash over a PuTTY SSH session while I was preparing for some testing with a Windows Server 2019 VM on ESXi 7, connected via both ports on the Intel card. The VM uses the Windows iSCSI initiator (the NICs are passed through to the VM as if they were raw networks, instead of using VMware iSCSI), as I was just doing some benchmarking. The drives on the 9211 were presented via iSCSI on ixl0, while the NVMe was presented as 2 targets on ixl1.
I was in the middle of writing a series of 4 GB files to the 9211 drives when I ran the firmware update on the 9300. Since sas2flash -listall and sas3flash -listall each came back with 1 controller, both with ID 0, I went ahead without specifying -c.
Code:
root@truenas[/tmp/firmware/9300-8i]# sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 16.00.00.00 (2013.03.01)
Copyright (c) 2008-2013 LSI Corporation. All rights reserved

        Adapter Selected is a LSI SAS: SAS2008(B2)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------
0  SAS2008(B2)     20.00.07.00    14.01.00.08    07.39.02.00     00:85:00:00

        Finished Processing Commands Successfully.
        Exiting SAS2Flash.
root@truenas[/tmp/firmware/9300-8i]# sas3flash -listall
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

        Adapter Selected is a Avago SAS: SAS3008(C0)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------
0  SAS3008(C0)     15.00.00.00    0e.00.00.07    08.35.00.00     00:84:00:00

        Finished Processing Commands Successfully.
        Exiting SAS3Flash.
root@truenas[/tmp/firmware/9300-8i]# sas3flash -o -f SAS9300_8i_IT.bin
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

        Advanced Mode Set

        Adapter Selected is a Avago SAS: SAS3008(C0)

        Executing Operation: Flash Firmware Image

                Firmware Image has a Valid Checksum.
                Firmware Version 16.00.12.00
                Firmware Image compatible with Controller.

                Valid NVDATA Image found.
                NVDATA Major Version 0e.01
                Checking for a compatible NVData image...

                NVDATA Device ID and Chip Revision match verified.
                NVDATA SubSystem Vendor and SubSystem Device ID match verified.
                NVDATA Versions Compatible.
                Valid Initialization Image verified.
                Valid BootLoader Image verified.

                Beginning Firmware Download...
                Firmware Download Successful.

                Verifying Download...

                Firmware Flash Successful.

                Resetting Adapter...
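For comparing firmware versions before and after a flash, the controller rows of the -listall output can be pulled out with a small script. This is only a rough sketch: `parse_listall`, `ROW`, and the field names are hypothetical names of my own, and it assumes the six whitespace-separated columns (Num, Ctlr, FW Ver, NVDATA, x86-BIOS, PCI Addr) shown in the output above.

```python
import re

# Assumed row layout, taken from the -listall output in this post:
#   0  SAS3008(C0)  15.00.00.00  0e.00.00.07  08.35.00.00  00:84:00:00
# The field names below are hypothetical labels, not anything sas3flash defines.
ROW = re.compile(
    r"^\s*(?P<num>\d+)\s+(?P<ctlr>\S+)\s+(?P<fw>\S+)\s+(?P<nvdata>\S+)"
    r"\s+(?P<bios>\S+)\s+(?P<pci>[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){3})\s*$"
)


def parse_listall(text):
    """Return one dict per controller row found in -listall output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


# Sample copied from the sas3flash -listall output before the flash.
SAMPLE = """\
Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------
0  SAS3008(C0)     15.00.00.00    0e.00.00.07    08.35.00.00     00:84:00:00
"""

if __name__ == "__main__":
    for row in parse_listall(SAMPLE):
        print(row["num"], row["ctlr"], row["fw"], row["pci"])
```

Running the same function over the output captured after the flash and comparing the `fw` field would show whether the version really changed, and the row count would confirm that each utility only sees one controller.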
After a while I heard the fans spin up, and the server was mid-POST. I saw the option ROM for the 9300 and entered it; it showed the new firmware. I let the TrueNAS server continue to boot. When I hit Enter in PuTTY, I got confirmation that the session had disconnected.
sas3flash -listall showed the correct new firmware version. Not trusting that the flash was successful, and trying to recreate the reboot, I ran the flash again, this time without any activity on the system. I didn't bother with the Windows server, as iSCSI was "reconnecting" but never reconnected. The server didn't reboot, sas3flash completed successfully this time, and the -listall output didn't change.
That in itself is odd, but I decided to restart my testing: delete the failed file and recreate it (I was just writing dummy files with random data, 4 GB each, no big deal). But iSCSI wouldn't reconnect.
Windows couldn't ping the 2 portals, and TrueNAS couldn't ping the 2 Windows NICs. Wireshark on Windows doesn't show any traffic from the TrueNAS server EXCEPT for LLDP packets from the 2 Intel MAC addresses on the respective Windows NICs.
I've link-cycled the DAC cables, changed the IP from 2.1 to 2.2 and back on one of the interfaces, and rebooted both servers, but I can't get a peep out of the TrueNAS ixl interfaces.
Code:
root@truenas[~]# ping 172.16.1.10
PING 172.16.1.10 (172.16.1.10): 56 data bytes
^C
--- 172.16.1.10 ping statistics ---
62 packets transmitted, 0 packets received, 100.0% packet loss
root@truenas[~]# ping 172.16.2.10
PING 172.16.2.10 (172.16.2.10): 56 data bytes
^C
--- 172.16.2.10 ping statistics ---
90 packets transmitted, 0 packets received, 100.0% packet loss
root@truenas[~]# ifconfig
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:25:90:4f:93:c0
        inet 192.168.10.164 netmask 0xffffff00 broadcast 192.168.10.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
igb1: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:25:90:4f:93:c1
        media: Ethernet autoselect
        status: no carrier
        nd6 options=9<PERFORMNUD,IFDISABLED>
igb2: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:25:90:4f:93:c2
        media: Ethernet autoselect
        status: no carrier
        nd6 options=9<PERFORMNUD,IFDISABLED>
igb3: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:25:90:4f:93:c3
        media: Ethernet autoselect
        status: no carrier
        nd6 options=9<PERFORMNUD,IFDISABLED>
ixl0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 40:a6:b7:9a:c6:24
        inet 172.16.1.1 netmask 0xffffff00 broadcast 172.16.1.255
        media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
ixl1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 40:a6:b7:9a:c6:25
        inet 172.16.2.1 netmask 0xffffff00 broadcast 172.16.2.255
        media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
root@truenas[~]# dmesg | grep ixl0
ixl0: <Intel(R) Ethernet Controller XXV710 for 25GbE SFP28 - 2.3.1-k> mem 0xfa000000-0xfaffffff,0xfb008000-0xfb00ffff irq 50 at device 0.0 numa-domain 1 on pci14
ixl0: fw 6.0.48442 api 1.7 nvm 6.01 etid 80003646 oem 1.263.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, MDIO & I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 8 RX queues 8 TX queues
ixl0: Using MSI-X interrupts with 9 vectors
ixl0: taskqgroup_attach_cpu failed 22
ixl0: Ethernet address: 40:a6:b7:9a:c6:24
ixl0: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
ixl0: SR-IOV ready
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from ixl0 (ifp 0xfffff801050f7800), ignoring.
ixl0: link state changed to DOWN
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
ixl0: link state changed to DOWN
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
ixl0: link state changed to DOWN
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
root@truenas[~]# dmesg | grep ixl1
ixl1: <Intel(R) Ethernet Controller XXV710 for 25GbE SFP28 - 2.3.1-k> mem 0xf9000000-0xf9ffffff,0xfb000000-0xfb007fff irq 50 at device 0.1 numa-domain 1 on pci14
ixl1: fw 6.0.48442 api 1.7 nvm 6.01 etid 80003646 oem 1.263.0
ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, MDIO & I2C
ixl1: Using 1024 TX descriptors and 1024 RX descriptors
ixl1: Using 8 RX queues 8 TX queues
ixl1: Using MSI-X interrupts with 9 vectors
ixl1: taskqgroup_attach_cpu failed 22
ixl1: Ethernet address: 40:a6:b7:9a:c6:25
ixl1: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl1: PCI Express Bus: Speed 8.0GT/s Width x8
ixl1: SR-IOV ready
ixl1: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl1: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from ixl1 (ifp 0xfffff81483a31800), ignoring.
ixl1: link state changed to DOWN
ixl1: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl1: link state changed to UP
ixl1: link state changed to DOWN
ixl1: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl1: link state changed to UP
ixl1: link state changed to DOWN
ixl1: Link is up, 10 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl1: link state changed to UP
root@truenas[~]#
I'm at a loss on this one. I'll continue testing tomorrow, but if anybody has ideas I'm all ears. It's not too big of a deal, as I can just blow away both servers entirely since it's all sandbox right now, but I would really love to get to the bottom of this. The link up/down events in the dmesg output were mostly me swapping the DAC cables, watching the MAC addresses change in Wireshark, and swapping them back.
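To separate the link flaps I caused by hand from any that happen on their own, the "link state changed" lines in dmesg can be tallied per interface. The following is a minimal sketch, assuming the message format shown in the dmesg output above; `count_link_events` and `LINK_EVENT` are hypothetical names of my own, and the sample is a shortened excerpt rather than the full log.

```python
import re
from collections import Counter

# Assumed message format, as seen in the dmesg output in this post:
#   ixl0: link state changed to UP
LINK_EVENT = re.compile(
    r"^(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)\s*$",
    re.MULTILINE,
)


def count_link_events(dmesg_text):
    """Count UP/DOWN link transitions per interface in dmesg text."""
    counts = Counter()
    for match in LINK_EVENT.finditer(dmesg_text):
        counts[(match.group("ifname"), match.group("state"))] += 1
    return counts


# Shortened excerpt in the same format as the output above.
SAMPLE = """\
ixl0: link state changed to UP
ixl0: link state changed to DOWN
ixl0: link state changed to UP
ixl1: link state changed to UP
"""

if __name__ == "__main__":
    for (ifname, state), n in sorted(count_link_events(SAMPLE).items()):
        print(f"{ifname} {state}: {n}")
```

If the counts keep climbing while nobody is touching the cables, that would point at the link itself rather than my cable swapping.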