
40G NICs not getting full performance

#1
I just upgraded my two FreeNAS systems from Chelsio T520 NICs to Chelsio T580 40G NICs. Both systems are Cisco C240 M3S units with LSI-9207-8i and LSI-9207-8e HBAs, running FreeNAS 11.1-U7. The primary unit has dual E5-2637 v2 @ 3.50GHz CPUs and 256G of RAM; the secondary has dual E5-2637 @ 3.00GHz CPUs and 128G of RAM. The switch is a Cisco Nexus 3000 C3064PQ, and I am using 2M OM4 fiber cables and new QSFP+ modules from FS.COM. The iperf3 performance isn't terrible, but it is certainly a lot less than I had hoped for or expected.
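For reference, these are plain single-stream runs, along the lines of the following (assuming iperf3 defaults and the IPs shown in the output below):

Code:
# on the receiving box
iperf3 -s
# on the sending box (primary -> secondary shown here)
iperf3 -c 192.168.252.23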

Code:
Primary as client:
Connecting to host 192.168.252.23, port 5201
[  5] local 192.168.252.27 port 18741 connected to 192.168.252.23 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.97 GBytes  16.9 Gbits/sec    0    841 KBytes
[  5]   1.00-2.00   sec  1.73 GBytes  14.9 Gbits/sec    0    845 KBytes
[  5]   2.00-3.00   sec  1.96 GBytes  16.8 Gbits/sec    0    872 KBytes
[  5]   3.00-4.00   sec  1.68 GBytes  14.5 Gbits/sec    0    872 KBytes
[  5]   4.00-5.00   sec  1.71 GBytes  14.7 Gbits/sec    0    897 KBytes
[  5]   5.00-6.00   sec  1.73 GBytes  14.8 Gbits/sec    0    912 KBytes
[  5]   6.00-7.00   sec  1.75 GBytes  15.0 Gbits/sec    0    920 KBytes
[  5]   7.00-8.00   sec  1.87 GBytes  16.1 Gbits/sec    0    958 KBytes
[  5]   8.00-9.00   sec  1.70 GBytes  14.6 Gbits/sec    0    982 KBytes
[  5]   9.00-10.00  sec  1.83 GBytes  15.7 Gbits/sec    0    982 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  17.9 GBytes  15.4 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  17.9 GBytes  15.4 Gbits/sec                  receiver

Secondary as client:
Connecting to host 192.168.252.27, port 5201
[  5] local 192.168.252.23 port 38278 connected to 192.168.252.27 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.06 GBytes  17.7 Gbits/sec    0    594 KBytes
[  5]   1.00-2.00   sec  2.05 GBytes  17.6 Gbits/sec    0    594 KBytes
[  5]   2.00-3.00   sec  2.98 GBytes  25.6 Gbits/sec    0   5.44 MBytes
[  5]   3.00-4.00   sec  3.03 GBytes  26.0 Gbits/sec  530    818 KBytes
[  5]   4.00-5.00   sec  2.87 GBytes  24.6 Gbits/sec    0    818 KBytes
[  5]   5.00-6.00   sec  2.80 GBytes  24.1 Gbits/sec    0    818 KBytes
[  5]   6.00-7.00   sec  2.82 GBytes  24.2 Gbits/sec   66    495 KBytes
[  5]   7.00-8.00   sec  2.05 GBytes  17.6 Gbits/sec    0    595 KBytes
[  5]   8.00-9.00   sec  2.05 GBytes  17.6 Gbits/sec    0    595 KBytes
[  5]   9.00-10.00  sec  2.05 GBytes  17.6 Gbits/sec    0    595 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  24.8 GBytes  21.3 Gbits/sec  596             sender
[  5]   0.00-10.02  sec  24.7 GBytes  21.2 Gbits/sec                  receiver

The switch is running NX-OS version 7.0(3)I7(6). I am not using jumbo frames at the moment. The ESXi hosts access the FreeNAS datastores (RAIDZ2 pools) via NFS, but I haven't compared anything at the datastore level yet; I am only looking at the iperf3 numbers.
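If I do experiment with jumbo frames later, my understanding is the change would look roughly like this on both ends (a sketch I have not applied; the Chelsio interface name cxl0 and the Nexus policy name are assumptions):

Code:
# FreeNAS side: bump the MTU on the 40G port (cxl0 assumed for the T580)
ifconfig cxl0 mtu 9000

! Nexus 3000 side: jumbo MTU is set via a network-qos policy, not per port
policy-map type network-qos jumbo
  class type network-qos class-default
    mtu 9216
system qos
  service-policy type network-qos jumbo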

These are the tunables from the primary:
[screenshot: primary tunables]

And these are the tunables from the secondary:
[screenshot: secondary tunables]
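For context, the knobs in play there are the standard FreeBSD network buffer sysctls, along these lines (illustrative values only, not necessarily what the screenshots contained):

Code:
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_inc=32768
net.inet.tcp.recvbuf_inc=65536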


The switch port configs could hardly be more vanilla:
Code:
interface Ethernet1/49
  description FreeNAS2
  switchport access vlan 252
  spanning-tree link-type point-to-point

interface Ethernet1/50
  description FreeNAS
  switchport access vlan 252
  spanning-tree link-type point-to-point

FYI, FreeNAS2 is the primary. Any ideas on where to hunt?
 
#3
No joy. It works, but iperf throughput is much less than I expected. I wonder if the boxes are CPU bound, but I am not sure.
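Next step is to watch per-core load while an iperf3 run is going, to see whether a single core is pegged (quick check with FreeBSD's top):

Code:
# -S includes system processes, -P breaks usage out per CPU
top -SP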
 

Jessep

#4
If I recall correctly, 40Gb is actually 4x 10Gb paired, whereas 100Gb is 4x 25Gb paired.

Should you expect to see 40Gb on a single stream?

If I'm way off on the above, the second question would be: can you generate 40Gb of traffic at all? Is the limitation on send or on receive?
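If it is a per-stream limit, several parallel streams ought to add up closer to line rate; something like this, assuming iperf3 on both boxes:

Code:
iperf3 -c <server-ip> -P 4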
 
#5
I can definitely get more than 10G in a single stream. What you are saying is true of a LAGG, but I don't think it is true of this particular NIC/switch combination. I can get 15G+ on the slower server and 25G+ on the faster one. That is why I am wondering whether it is CPU bound or whether there is some more tuning I need to do. The two FreeNAS boxes are the only ones I have with 40G connections; I have done a lot of 10G, but not a lot of 40G. I can't help but wonder if it is CPU/resource related, since the faster server with more memory can generate significantly more traffic than the slower one, but I am not sure where to go next.
 

Jessep

#6
I was thinking of "lanes".
https://www.theregister.co.uk/2017/02/06/decoding_25gb_ethernet_and_beyond/
Faster Ethernet was eventually needed. It was decided that the next steps were to be 40Gb and 100Gb. With everyone having had so much fun the last time 40Gb is actually 4x 10.3125Gb lanes. Meanwhile, 100Gb can come in either 10x 10.3125Gb lanes or 4x 25.78125Gb lanes. Because of course it can.

Also this link from Mellanox
https://blog.mellanox.com/2016/03/25-is-the-new-10-50-is-the-new-40-100-is-the-new-amazing/
 
#7
I think there must be something hardware-wise that is limiting me to around 23G.
Code:
Connecting to host 192.168.252.27, port 5201
[  5] local 192.168.252.27 port 14304 connected to 192.168.252.27 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.53 GBytes  21.7 Gbits/sec   35   2.26 MBytes
[  5]   1.00-2.00   sec  2.64 GBytes  22.7 Gbits/sec    0   2.26 MBytes
[  5]   2.00-3.00   sec  2.74 GBytes  23.6 Gbits/sec    0   2.26 MBytes
[  5]   3.00-4.00   sec  2.79 GBytes  23.9 Gbits/sec    0   2.26 MBytes
[  5]   4.00-5.00   sec  2.60 GBytes  22.3 Gbits/sec    7   4.05 MBytes
[  5]   5.00-6.00   sec  2.68 GBytes  23.0 Gbits/sec    0   4.05 MBytes
[  5]   6.00-7.00   sec  2.73 GBytes  23.5 Gbits/sec    0   4.05 MBytes
[  5]   7.00-8.00   sec  2.72 GBytes  23.4 Gbits/sec   23   2.15 MBytes
[  5]   8.00-9.00   sec  2.54 GBytes  21.8 Gbits/sec    0   2.15 MBytes
[  5]   9.00-10.00  sec  2.69 GBytes  23.1 Gbits/sec    0   3.46 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  26.7 GBytes  22.9 Gbits/sec   65             sender
[  5]   0.00-10.00  sec  26.7 GBytes  22.9 Gbits/sec                  receiver

This is the faster of my two units talking to itself. I am a little confused by the retries; I haven't seen those when running iperf before. So even if the 40G is multiple lanes, it does appear that they bond rather than load balance. It feels a little silly to whine about only being able to push 20G+ on the network, but I am a little disappointed. I would certainly contemplate upgrading some components, but I am not sure which ones that would be.
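If the retries keep showing up, I will compare the TCP retransmit counters on both ends (plain FreeBSD netstat):

Code:
netstat -s -p tcp | grep -i retrans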

It does seem that 4 streams is where I get the very highest values from iperf; it drops off when I go to 8.
4 streams from the faster unit:
Code:
[SUM]   0.00-10.00  sec  29.6 GBytes  25.4 Gbits/sec    0             sender
[SUM]   0.00-10.00  sec  29.6 GBytes  25.4 Gbits/sec                  receiver

4 streams on the slower unit:
Code:
[SUM]   0.00-10.00  sec  23.1 GBytes  19.9 Gbits/sec    0             sender
[SUM]   0.00-10.00  sec  23.1 GBytes  19.9 Gbits/sec                  receiver
 
#8
I think there must be something in my hardware (likely the CPU) that is limiting me. I just ran an iperf test against the loopback interface.
Code:
root@freenas2:/nonexistent # iperf3 -c 127.0.0.1
Connecting to host 127.0.0.1, port 5201
[  5] local 127.0.0.1 port 64550 connected to 127.0.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.56 GBytes  22.0 Gbits/sec   26   3.49 MBytes
[  5]   1.00-2.00   sec  2.41 GBytes  20.7 Gbits/sec    0   6.62 MBytes
[  5]   2.00-3.00   sec  2.77 GBytes  23.8 Gbits/sec    0   6.62 MBytes
[  5]   3.00-4.00   sec  2.37 GBytes  20.4 Gbits/sec    0   7.01 MBytes
[  5]   4.00-5.00   sec  2.35 GBytes  20.2 Gbits/sec   95   2.27 MBytes
[  5]   5.00-6.00   sec  2.43 GBytes  20.9 Gbits/sec    0   7.01 MBytes
[  5]   6.00-7.00   sec  2.36 GBytes  20.3 Gbits/sec  126   5.68 MBytes
[  5]   7.00-8.00   sec  2.48 GBytes  21.3 Gbits/sec    0   5.74 MBytes
[  5]   8.00-9.00   sec  2.42 GBytes  20.8 Gbits/sec    0   7.01 MBytes
[  5]   9.00-10.00  sec  2.44 GBytes  21.0 Gbits/sec    0   7.01 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  24.6 GBytes  21.1 Gbits/sec  247             sender
[  5]   0.00-10.00  sec  24.6 GBytes  21.1 Gbits/sec                  receiver

iperf Done.

I was able to get a little higher with 4 streams, but that seems to be as high as it will go.
Code:
[SUM]   0.00-10.00  sec  28.6 GBytes  24.6 Gbits/sec    0             sender
[SUM]   0.00-10.00  sec  28.6 GBytes  24.6 Gbits/sec                  receiver

Does that sound reasonable? I have the fastest CPU I can get in this server; I could get more cores, but I don't think that would help.
 
#9
I have upgraded the CPUs in my secondary FreeNAS to match the primary, and the results were fairly consistent. I did notice some retries in iperf which seemed to be slowing things down. I was able to clear the retries by changing some sysctl settings.
Code:
sysctl net.inet.tcp.blackhole=2
sysctl net.inet.udp.blackhole=1

What tipped me off to this was the following message:
Code:
Limiting open port RST response from 236 to 200 packets/sec

I still appear to be capped in the upper 20G range even on the loopback interface, so I guess that is a hardware limit. Sigh.
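If the blackhole settings hold up, I will make them persistent. On FreeNAS that means adding them under System -> Tunables with type sysctl, rather than editing /etc/sysctl.conf, since manual edits there generally do not survive on FreeNAS:

Code:
# System -> Tunables, Type: sysctl
net.inet.tcp.blackhole = 2
net.inet.udp.blackhole = 1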
 