Very slow performance via IPsec tunneling

f3lang

Cadet
Joined
Aug 17, 2020
Messages
1
Hi everyone,

I have been stuck for several days on a gnarly problem, my brain feels thoroughly cooked by it, and I hope you can help me :)
My setup is a FreeNAS 11.3 system in my network:

128 GB RAM
2x Xeon E5-2670
16x 2 TB ST2000NM0001 as storage
2x onboard Intel I350 gigabit NICs
1x Chelsio T540 with 4x 10G ports

Within my own networks everything works fine, but I am trying to make the system available at another location as well. For this I have configured an IPsec
site-to-site tunnel from the NAS location to the target location. The network layout looks like this:

NAS system (192.168.30.11, one of the onboard Intel NICs) ---1G---> (192.168.30.1) UniFi Dream Machine (NAT: 192.168.25.2) ---1G---> (192.168.25.1) UBNT EdgeRouter (dynamic public IP) ---(250/250 Mbit/s) IPsec---> (public static IP) pfSense (10.9.0.1) ---10G---> (10.9.0.106) target VM

The IPsec site-to-site connection is established between the 192.168.25.0/24 and 10.9.0.0/16 networks.

If I run iperf from another system in the 192.168.30.0/24 network (the test host, 192.168.30.213) to the target VM on 10.9.0.106, everything works as intended: a stable 250-280 Mbit/s in both directions.
But when I run the same test from the FreeNAS system, the speed is cripplingly slow at 5-30 Mbit/s.
I have checked and tested every path in the network, and apart from the link from the NAS to the 10.9.0.0/16 network, every connection performs at full line rate:
from FreeNAS:
192.168.30.11 <--> 192.168.30.1 960-980Mbit/s
192.168.30.11 <--> 192.168.25.1 890-950Mbit/s
192.168.30.11 <--> 10.9.0.1 5-30Mbit/s
192.168.30.11 <--> 10.9.0.106 5-30Mbit/s

from the test host:
192.168.30.213 <--> 192.168.30.1 960-980Mbit/s
192.168.30.213 <--> 192.168.25.1 890-950Mbit/s
192.168.30.213 <--> 10.9.0.1 250-280Mbit/s
192.168.30.213 <--> 10.9.0.106 250-280Mbit/s

from the EdgeRouter:
192.168.25.1 <--> 10.9.0.1 250-280Mbit/s
192.168.25.1 <--> 10.9.0.106 250-280Mbit/s

from pfSense:
10.9.0.1 <--> 10.9.0.106 9.8Gbit/s

As I started running out of ideas, I spun up a virtual Ubuntu server on the FreeNAS system, bridged its network interface to the same interface FreeNAS uses for
192.168.30.11, and to my surprise the Ubuntu VM gets full line rate to the target network:

from the Ubuntu VM (192.168.30.246):
192.168.30.246 <--> 192.168.30.1 960-980Mbit/s
192.168.30.246 <--> 192.168.25.1 890-950Mbit/s
192.168.30.246 <--> 10.9.0.1 250-280Mbit/s
192.168.30.246 <--> 10.9.0.106 250-280Mbit/s

So FreeNAS and the Ubuntu VM are now sending traffic over the same NIC, yet the FreeNAS connection through the tunnel crawls while Ubuntu can push data at full speed.

I know that I currently do not have a direct site-to-site tunnel between 192.168.30.0/24 and 10.9.0.0/16. The IPsec setup used to have multiple tunnels to different networks, and I also tried putting the NAS into one of those networks, but the result stayed the same, and I figured the current configuration gives the best debugging visibility.
I have already removed all the other networks out of desperation.

I have taken traffic captures of both FreeNAS and the Ubuntu VM, and the traffic looks exactly the same for both machines.
I have reached Google page 5 for nearly everything I could imagine, turned hardware offloading off and on everywhere possible, changed tcp rmem, wmem, send buffer, receive buffer, tried different congestion algorithms, and I am now really running out of options.
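For reference, a comparison along these lines is roughly what I mean by "looks exactly the same" (a minimal scapy sketch; the capture file names are placeholders, not my actual files):

```python
# Minimal sketch: summarize two captures so the FreeNAS and Ubuntu VM
# traffic can be compared side by side. File names are placeholders.
from scapy.all import rdpcap, IP, TCP  # pip install scapy

def summarize(path):
    pkts = [p for p in rdpcap(path) if IP in p]
    if not pkts:
        print(f"{path}: no IP packets")
        return
    sizes = sorted(len(p) for p in pkts)
    df = sum(1 for p in pkts if p[IP].flags & 0x02)  # DF (don't fragment) set
    syn_mss = [opt[1] for p in pkts if TCP in p and p[TCP].flags & 0x02  # SYN
               for opt in p[TCP].options if opt[0] == "MSS"]
    print(f"{path}: {len(pkts)} pkts, max {sizes[-1]} B, "
          f"median {sizes[len(sizes) // 2]} B, DF set on {df}, SYN MSS {syn_mss}")

summarize("freenas_to_target.pcap")     # placeholder capture names
summarize("ubuntu_vm_to_target.pcap")
```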

Maybe one of you network geniuses has a good idea of how to debug this further or solve the problem.
I will be grateful for every straw and bit of help you can give me :)

Thanks,
f3lang
 
Joined
Dec 29, 2014
Messages
1,135
What are you using to copy files (SMB, NFS, SFTP, rsync)? My guess is one of two things: either you are losing some of the available MTU to the tunnel overhead, or it is a TCP window issue. If the sending host's TCP stream only allows so much data to be unacknowledged (the TCP window size), it will wait to send more even when bandwidth is available. That seems like a possibility since you said iperf gives you the expected throughput from the other hosts. You could perhaps work some of that backwards by looking at how often packets are transmitted. That assumes you aren't seeing retransmits or ICMP 'packet too large' messages.
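As a rough sanity check on the window theory, you can work out the bandwidth-delay product for the tunnel. The 30 ms RTT below is an assumption; measure the real round-trip time across the tunnel and plug it in:

```python
# Back-of-the-envelope: how much unacknowledged data must be in flight to
# fill a 250 Mbit/s path, and what a small window would cap throughput at.
link_mbps = 250        # provisioned tunnel bandwidth (from the original post)
rtt_ms = 30            # ASSUMED round-trip time across the tunnel; measure it

bdp_bytes = (link_mbps * 1_000_000 / 8) * (rtt_ms / 1000)
print(f"bandwidth-delay product ~ {bdp_bytes / 1024:.0f} KiB")   # ~916 KiB

window_bytes = 64 * 1024   # example of a small effective window
max_mbps = window_bytes * 8 / (rtt_ms / 1000) / 1_000_000
print(f"a 64 KiB window at {rtt_ms} ms RTT caps at ~ {max_mbps:.0f} Mbit/s")
# ~17 Mbit/s, i.e. right in the 5-30 Mbit/s range seen from FreeNAS
```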
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Check your captures. In all likelihood the DF (don't fragment) bit is set, and your low throughput is due to FreeNAS honoring that bit and dropping packets that are too large to fit through the tunnel.
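For a ballpark of how much headroom the tunnel actually leaves, here's the overhead arithmetic, assuming ESP in tunnel mode with NAT-T and an AES-CBC/HMAC transform over a 1500-byte WAN path (the exact numbers depend on your cipher and padding):

```python
# Rough IPsec MTU budget. Per-field sizes assume ESP tunnel mode with
# NAT-T (UDP 4500) and an AES-CBC/HMAC-SHA transform; adjust for your setup.
wan_mtu = 1500
overhead = (
    20        # outer IPv4 header added by tunnel mode
    + 8       # NAT-T UDP encapsulation
    + 8       # ESP header (SPI + sequence number)
    + 16      # ESP IV (AES-CBC)
    + 16      # ESP trailer: pad length + next header + worst-case padding
    + 16      # ESP ICV (truncated HMAC)
)
tunnel_mtu = wan_mtu - overhead
tcp_mss = tunnel_mtu - 20 - 20   # inner IPv4 + TCP headers
print(f"usable tunnel MTU ~ {tunnel_mtu}, max inner TCP MSS ~ {tcp_mss}")
# Full-size 1500-byte packets with DF set simply won't fit and get dropped,
# which matches the stall pattern described above.
```

The usual workaround is MSS clamping on the tunnel endpoints, or lowering the MTU on the FreeNAS interface so its packets fit within that budget.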
 