Extremely slow ssh/scp over WAN

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
I have two FreeNAS servers - one in NY and one in CA, with average network latency of about 70ms.

Both locations are connected by an IPSEC tunnel.

NY replicates its datasets to CA, and the replication runs at 2 MB/s at most, averaging around 1 MB/s.

A simple "scp" of a file also shows the same speed.

On the other hand, Linux-to-Linux transfers max out the VPN router's capacity at around 25-30 MB/s (Juniper SSG520, aes128/sha).

Linux-to-FreeNAS - slow. FreeNAS-to-Linux - slow. Linux-to-Linux - fast (with defaults).

I have checked out every TCP tuning guide I could find, to no avail.

Here are the sysctls I have added so far:

kern.ipc.maxsockbuf=16777216
kern.ipc.somaxconn=1024
kern.maxfiles=204800
kern.maxfilesperproc=200000
net.inet.tcp.delayed_ack=1
net.inet.tcp.mssdflt=1460
net.inet.tcp.recvbuf_inc=262144
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=1048576
net.inet.tcp.rfc3042=1 (tried both 0 and 1 - makes no difference whatsoever)
net.inet.tcp.sendbuf_inc=262144
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.sendspace=1048576
net.inet.tcp.syncookies=0
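
As a back-of-the-envelope sanity check (assuming throughput is roughly window / RTT over this 70 ms path):

Code:
# ceiling with the 1 MB send/recvspace configured above: ~15 MB/s
echo $((1048576 * 1000 / 70))
# ceiling with a 64 KB effective window: ~0.9 MB/s (suspiciously close to what I'm seeing)
echo $((65536 * 1000 / 70))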

Some guides also recommend enabling and tuning inflight settings, but:

[root@freenas2-ny] ~# sysctl net.inet.tcp.inflight.enable=1
sysctl: unknown oid 'net.inet.tcp.inflight.enable'
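
For what it's worth, listing the subtree would show whether anything like it exists on this build at all (as far as I can tell the inflight limiter was removed from FreeBSD some time ago, so these oids may simply no longer be there):

Code:
sysctl net.inet.tcp | grep -i inflight
sysctl net.inet.tcp.cc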

I'm not sure what else to do... :(
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I'm out of my league here but I expect the correct sysctl is inflight.enable. Maybe you should do a search for "inflight.enable" in these forums and read the messages. This may help.
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
I'm out of my league here but I expect the correct sysctl is inflight.enable. Maybe you should do a search for "inflight.enable" in these forums and read the messages. This may help.


Already have. No dice :(
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
Smells like you're hitting the default TCP window.

What's the RTT between the hosts?

What does /sbin/sysctl net.inet.tcp.hostcache.list produce?


I assume sysctl net.inet.tcp.rfc1323 returns 1, correct?


 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
What is the speed between local Linux and local FreeNAS, that is, over the LAN rather than the WAN?
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
You could try changing the default CC algorithm, NewReno, to something else. NewReno is solid and perfectly fine for use in LANs, but it might not be the best choice for WAN connections. I'd suggest giving H-TCP (loss-based) or CHD (latency-based) a try.

Edit: Oh, and you could of course try CUBIC, which is the algorithm Linux uses.
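
Off the top of my head (untested here, so just a sketch of stock FreeBSD commands), switching at runtime would look roughly like this:

Code:
sysctl net.inet.tcp.cc.available         # list the algorithms currently loaded
kldload cc_htcp                          # or cc_cubic / cc_chd
sysctl net.inet.tcp.cc.algorithm=htcp    # switch new connections over to it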
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
Smells like you're hitting the default TCP window.

What's the RTT between the hosts?

~70ms

What does /sbin/sysctl net.inet.tcp.hostcache.list produce?



Code:
[root@nas1-ny] ~# sysctl net.inet.tcp.hostcache.list
net.inet.tcp.hostcache.list:
IP address        MTU  SSTRESH      RTT  RTTVAR BANDWIDTH    CWND SENDPIPE RECVPIPE HITS  UPD  EXP
10.11.0.150        0    2852  1791ms    794ms        0    4560        0        0 1238  380 3900
10.11.0.42          0        0      8ms    14ms        0    6571        0        0  12    5 3300
10.11.0.138        0        0      5ms      9ms        0    6902        0        0    0    1 2700
10.11.0.74          0        0      6ms    10ms        0    6394        0        0    0    1 2400
10.11.0.130        0        0      1ms      1ms        0    4482        0        0  15    2 2100
10.11.0.94          0        0    108ms      1ms        0    4380        0        0  164    1 3600
10.11.2.122        0    5952    35ms    49ms        0    4381        0        0  81    1 3000
10.11.0.71          0        0      8ms    13ms        0    8143        0        0  246  18 3600
10.11.0.131        0    54312      7ms    12ms        0    55434        0        0  36    2 1200
10.11.2.119        0    8928    72ms    48ms        0    4381        0        0    3    1 2400
10.11.2.123        0    40176    96ms    28ms        0    13777        0        0  41    1    0
127.0.0.1          0        0      1ms      1ms        0    86331        0        0 26769 2123 3900
10.11.0.32          0        0    37ms    49ms        0    4380        0        0  110  11 3900
10.11.0.24          0    9068      3ms      4ms        0    21619        0        0  102  34 3600
10.11.0.92          0        0    39ms    56ms        0    4381        0        0    3    1 3000
10.11.0.80          0        0    58ms    58ms        0    4380        0        0  135    9 3000
10.11.0.28          0        0      1ms      1ms        0    5082        0        0  39    7 2100
10.11.0.96          0        0    75ms    28ms        0    4380        0        0  140    2 2100
10.11.0.45          0    53196      6ms    10ms        0    46388        0        0    6    4 3900
10.11.0.137        0        0      6ms    10ms        0    5878        0        0    3    2 3600
10.11.0.49          0    50592      8ms    12ms        0    28503        0        0  24    5 3600
10.11.0.85          0    55056    27ms    30ms        0    4381        0        0  206    4 3600
10.11.0.33          0        0      8ms    10ms        0    12877        0        0  18    7    0
10.11.0.133        0    57288      6ms    11ms        0    35669        0        0  36    2 1500
10.11.2.113        0    2976    106ms      6ms        0    5852        0        0  41    1 1500


I assume sysctl net.inet.tcp.rfc1323 returns 1, correct?


Yep.
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
You could try changing the default CC algorithm, NewReno, to something else. NewReno is solid and perfectly fine for use in LANs, but it might not be the best choice for WAN connections. I'd suggest giving H-TCP (loss-based) or CHD (latency-based) a try.

Edit: Oh, and you could of course try CUBIC, which is the algorithm Linux uses.


I probably will, but they're disabled by default, and it looks like I would have to reboot the server in order to enable them :(
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
You could try changing the default CC algorithm, NewReno, to something else. NewReno is solid and perfectly fine for use in LANs, but it might not be the best choice for WAN connections. I'd suggest giving H-TCP (loss-based) or CHD (latency-based) a try.

Edit: Oh, and you could of course try CUBIC, which is the algorithm Linux uses.


Code:
[root@nas1-hk] ~# kldload cc_chd
kldload: can't load cc_chd: Exec format error


:(
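
A few checks I can still run (just guesses; "Exec format error" from kldload usually means the module doesn't match the running kernel or a dependency is missing, and dmesg shows the real reason):

Code:
uname -r                      # running kernel version
ls -l /boot/kernel/cc_*.ko    # which CC modules shipped with this build
dmesg | tail                  # the actual linker error behind "Exec format error"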
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You don't have some funny QOS going on, do you?
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
You don't have some funny QOS going on, do you?


Nope. Just a dumb aes128/sha IPSEC tunnel between the sites, with mss=1387 (Amazon AWS insists on it).

When I loaded cc_cubic and enabled it, the situation got better, but still not quite at the same level as Linux-to-Linux. Now I'm getting just under 10 MB/s, which is better than the 1-2 MB/s I had before. Considering I didn't have to change a thing in Linux, it looks like FreeBSD is not very well suited for high latency, high bandwidth IP communication :(

And even with "cubic", transfers between NY and Hong Kong are still at around 320-500 KB/s :( Occasionally they go up to 2 MB/s.

All of my sites have 2x1G uplinks. The latency from NY to HK is between 200 and 300 ms. Latency from NY to CA is around 70-75 ms.

BTW, I also tried HTCP - CUBIC seems to perform much better. Haven't tried CHD, as I can't seem to load it :(
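
If cubic keeps winning, something like this should make it stick across reboots (a sketch using plain FreeBSD conventions; on FreeNAS the same two values would normally be added as tunables in the GUI rather than by editing files):

Code:
echo 'cc_cubic_load="YES"' >> /boot/loader.conf             # load the module at boot
echo 'net.inet.tcp.cc.algorithm=cubic' >> /etc/sysctl.conf  # select it as the default CC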
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
I've used FreeBSD to do >1Gb/s transfers for HPC applications across the US on shared 10G links, some with relatively high latency.. It's just a matter of understanding where the bottleneck is and getting everything tuned properly..

It does suck that we've been using the same manual process for about 15 years to tune the exact same kernel options to address these issues. :p
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
Considering I didn't have to change a thing in Linux, it looks like FreeBSD is not very well suited for high latency, high bandwidth IP communication :(

Well, then I'm all out of good ideas. I can only provide you with some random bits for trial and error testing in case you are desperate:
  • net.inet.tcp.delayed_ack=0 (maybe there's a wicked middlebox in the path that applies reasoning based on the rate of acks?)
  • dev.igb.0.fc=0 (congestion control on link layer can interfere with TCP's CC and can be misused for link throttling)
  • net.inet.tcp.cc.htcp.adaptive_backoff=1 (when using HTCP try to keep queues filled on backoff)
I got most of these from this guide (very good read imho).

Other than that, I'd suggest doing an actual measurement for root-cause analysis. I guess you could take a capture with tcpdump and then check it with the nice TCP tools in Wireshark.
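
For example, something along these lines on the FreeNAS side (the interface name and peer address are placeholders, substitute your own):

Code:
# capture one replication/scp flow with full packets for later Wireshark analysis
tcpdump -i igb0 -s 0 -w /tmp/repl.pcap 'host 10.11.2.123 and port 22'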


BTW, I also tried HTCP - CUBIC seems to perform much better. Haven't tried CHD, as I can't seem to load it :(
Sorry, I have actually never switched the CC algorithm, so I'm not really sure if CHD is actually supported by FreeNAS. I'm still on 9.1.1 and don't even have the HTCP module...
However, I wonder a bit about the big difference between HTCP and CUBIC. They seem to be based on the same concept and kind of similar... *shrug*
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
I've used FreeBSD to do >1Gb/s transfers for HPC applications across the US on shared 10G links, some with relatively high latency.. It's just a matter of understanding where the bottleneck is and getting everything tuned properly..

It does suck that we've been using the same manual process for about 15 years to tune the exact same kernel options to address these issues. :p


I agree - I think that by now some (all?) of the tuning settings should have been made default.

And in my case, I suspect the fact that it's all going through an IPSEC tunnel, with all the things that entails (such as fragmenting packets to fit within smaller MTU) makes things more complicated :(
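
One thing I can still check is whether full-size packets actually make it through the tunnel without fragmenting (assuming FreeBSD's ping flags: -D sets the don't-fragment bit, -s is the ICMP payload size; the peer address is a placeholder):

Code:
# step the size up/down around the tunnel MTU to find where it breaks
ping -D -s 1400 10.11.2.123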
 

dniq

Explorer
Joined
Aug 29, 2012
Messages
74
Well, then I'm all out of good ideas. I can only provide you with some random bits for trial and error testing in case you are desperate:
  • net.inet.tcp.delayed_ack=0 (maybe there's a wicked middlebox in the path that applies reasoning based on the rate of acks?)
  • dev.igb.0.fc=0 (congestion control on link layer can interfere with TCP's CC and can be misused for link throttling)
  • net.inet.tcp.cc.htcp.adaptive_backoff=1 (when using HTCP try to keep queues filled on backoff)

Already tried delayed_ack with both values - no effect :(

I got most of these from this guide (very good read imho).


Thanks! I'll have a look! :)


Sorry, I have actually never switched the CC algorithm, so I'm not really sure if CHD is actually supported by FreeNAS. I'm still on 9.1.1 and don't even have the HTCP module...
However, I wonder a bit about the big difference between HTCP and CUBIC. They seem to be based on the same concept and kind of similar... *shrug*


The biggest difference is that with HTCP the throughput drops for longer periods of time, and climbs up slower. With CUBIC it all happens much faster, so the average throughput is higher.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
I agree - I think that by now some (all?) of the tuning settings should have been made default.

And in my case, I suspect the fact that it's all going through an IPSEC tunnel, with all the things that entails (such as fragmenting packets to fit within smaller MTU) makes things more complicated :(


I'd still like to see the host cache output for your replication targets.

If your linux box is performing as expected, then theoretically all the intermediate "stuff" handling the tunnel is doing a good enough job that we shouldn't need to look there for problems.

Having said that, I have seen firewalls do dumb things, particularly teal-colored boxes. What platform are you using to terminate the IPSEC?

Can you put wireshark on a mirror of the switch port facing freenas?
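
Even just the replication peer's row from the host cache would help (the address here is a placeholder, substitute the CA/HK target):

Code:
sysctl net.inet.tcp.hostcache.list | grep -E 'IP address|10.11.2.123'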
 