FN11, 10 & 40Gbe and dismal performance with replication


HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
System A:
Supermicro X10SRH-CLN4F Motherboard
1 x Intel Xeon E5-2640 V3 8 Core 2.66GHz
4 x 16GB PC4-17000 DDR4 2133MHz Registered ECC
12 x 4TB HGST HDN724040AL 7200RPM NAS SATA Hard Drives
2 x 6 Drive RAIDZ2 VDEVs
LSI3008 SAS Controller - Flashed to IT Mode (Firmware Version 12.00.02.00)
LSI SAS3x28 SAS Expander
LSI9211-8i SAS Controller - Flashed to IT Mode (Firmware Version 20.00.02.00)
(connects to external JBOD enclosure)
Dual 920 Watt Platinum Power Supplies
16GB USB Thumb Drive for booting
Chelsio T580-SO-CR Dual 40Gbe NIC (Replication Connection to backup FreeNAS server)
Chelsio T520-SO-CR Dual 10Gbe NIC (Data connection to Plex server & media management server)
FreeNAS-11.0-U1 (aa82cc58d)

System B:
EMC Isilon SuperMicro X8DT6-A Motherboard
2 x Intel Xeon E5603 4 Core Processors
96GB (12 x 8GB) DDR3 PC10600 (1333) REG ECC Memory
2 x SanDisk 8GB SATADOM Boot Drives (Mirrored)
LSI SAS3081E-R w/Expander
36 x 3TB 7200RPM Hitachi HDS72303 Hard Drives
4 x 9 Drive RAIDZ2 VDEVs
Dual 1200 Watt Gold Power Supplies
Chelsio T580-SO-CR Dual 40Gbe NIC (Replication Connection to primary FreeNAS server)
Chelsio T520-SO-CR Dual 10Gbe NIC (Data connection to Plex server & media management server)
APC Smart-UPS RT 3000
FreeNAS-11.0-U1 (aa82cc58d)



I recently upgraded both of my FreeNAS servers from Corral to 11.0-U1. My network utilizes 10Gbe and 40Gbe connections using the Chelsio 5xx series cards. All cards are in x8 or x16 slots on my motherboards listed above. My goal was the rapid replication of backup data to my primary machine in the event that I had a failure of the primary (System A). We are talking about around ~40TB of live data. Both systems are in the same rack connected via twinax, no switches in the mix on the backend.

Since I did the upgrade to 11, I wanted to retest my network and make sure everything was still working as it should.

Upon deployment, I ran the following tests:

1) System B (Backup System) - Local copy of data using dd to a new, uncompressed dataset. This exercised both reads and writes to verify that the system configuration would support the overall read and write speeds I wanted.

Code:
root@plexnasii:/mnt/vol1/test # zfs get all vol1/test
NAME	   PROPERTY					  VALUE						 SOURCE
vol1/test  type						  filesystem					-
vol1/test  creation					  Sun Jul 16 11:44 2017		 -
vol1/test  used						  637G						  -
vol1/test  available					 71.6T						 -
vol1/test  referenced					637G						  -
vol1/test  compressratio				 1.00x						 -
vol1/test  mounted					   yes						   -
vol1/test  quota						 none						  default
vol1/test  reservation				   none						  default
vol1/test  recordsize					128K						  default
vol1/test  mountpoint					/mnt/vol1/test				default
vol1/test  sharenfs					  off						   default
vol1/test  checksum					  on							default
vol1/test  compression				   off						   local
vol1/test  atime						 on							default
vol1/test  devices					   on							default
vol1/test  exec						  on							default
vol1/test  setuid						on							default
vol1/test  readonly					  off						   default
vol1/test  jailed						off						   default
vol1/test  snapdir					   hidden						default
vol1/test  aclmode					   passthrough				   inherited from vol1
vol1/test  aclinherit					passthrough				   inherited from vol1
vol1/test  canmount					  on							default
vol1/test  xattr						 off						   temporary
vol1/test  copies						1							 default
vol1/test  version					   5							 -
vol1/test  utf8only					  off						   -
vol1/test  normalization				 none						  -
vol1/test  casesensitivity			   sensitive					 -
vol1/test  vscan						 off						   default
vol1/test  nbmand						off						   default
vol1/test  sharesmb					  off						   default
vol1/test  refquota					  none						  default
vol1/test  refreservation				none						  default
vol1/test  primarycache				  all						   default
vol1/test  secondarycache				all						   default
vol1/test  usedbysnapshots			   0							 -
vol1/test  usedbydataset				 637G						  -
vol1/test  usedbychildren				0							 -
vol1/test  usedbyrefreservation		  0							 -
vol1/test  logbias					   latency					   default
vol1/test  dedup						 off						   default
vol1/test  mlslabel													-
vol1/test  sync						  standard					  default
vol1/test  refcompressratio			  1.00x						 -
vol1/test  written					   637G						  -
vol1/test  logicalused				   637G						  -
vol1/test  logicalreferenced			 637G						  -
vol1/test  volmode					   default					   default
vol1/test  filesystem_limit			  none						  default
vol1/test  snapshot_limit				none						  default
vol1/test  filesystem_count			  none						  default
vol1/test  snapshot_count				none						  default
vol1/test  redundant_metadata			all						   default
vol1/test  org.freenas:description									 local
vol1/test  org.freenas:permissions_type  PERM						  inherited from vol1


Code:
[PLEXNAS-II LOCAL WRITE TEST]
root@plexnasii:/mnt/vol1/test # dd if=/dev/zero of=testfile bs=10M count=50000
50000+0 records in
50000+0 records out
524288000000 bytes transferred in 558.926291 secs (938027086 bytes/sec)
7.5Gbits/sec

[PLEXNAS-II LOCAL READ TEST]
root@plexnasii:/mnt/vol1/test # dd if=testfile of=/dev/null bs=10M count=50000
50000+0 records in
50000+0 records out
524288000000 bytes transferred in 506.344610 secs (1035437111 bytes/sec)
8.28Gbits/sec



I ran the exact same tests on System A (primary) also to a new, uncompressed dataset:

Code:
[PLEXNAS LOCAL WRITE TEST]
root@plexnas:/mnt/vol1/test # dd if=/dev/zero of=testfile bs=10M count=50000
50000+0 records in
50000+0 records out
524288000000 bytes transferred in 621.702142 secs (843310589 bytes/sec)
6.746Gbits/sec

[PLEXNAS LOCAL READ TEST]
root@plexnas:/mnt/vol1/test # dd if=testfile of=/dev/null bs=10M count=50000
50000+0 records in
50000+0 records out
524288000000 bytes transferred in 631.273400 secs (830524460 bytes/sec)
6.644Gbits/sec



My next step was to test overall connectivity with iperf. Results were 16Gbits/sec in both directions on the primary 40Gbe network card. Since this was well below the 40Gbe target, I engaged Chelsio to determine the reason for the lower performance on these cards. Chelsio support has been able to reproduce the problem in their lab and is investigating the cause. For reference, my Chelsio 10Gbe cards are showing 9.9Gbits/sec between the same machines.
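For anyone who wants to repeat the test, the runs were along these lines (iperf2 as bundled with FreeNAS; the address and stream count are just examples):

Code:
# on the receiving server:
iperf -s

# on the sending server, a single TCP stream:
iperf -c 10.0.12.2 -t 30

# and again with parallel streams, which often matters on 40Gbe:
iperf -c 10.0.12.2 -t 30 -P 4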

Because I know that iperf does not test the actual throughput of data between the systems, I created an NFS mount point on one of the systems and transferred test data (in my case about 500GB of movies) using the cp command across this connection from System A to System B. In actual usage I was seeing about 4Gbits/sec, across both my 40Gbe connection and my test 10Gbe connection on the same machine. So at this point I am assuming that I have hit some sort of physical limit on the actual copying of data from one system to the other, based on the systems' ability to a) read the data off the drives in System A, b) cp the data across the NFS link to System B, and c) write that data to the drives on System B. This copy was also to the uncompressed test dataset on System B and back again. I also used dd from one system to the other with nearly the same results.
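Roughly, the cross-system test looked like this (mount point and file names are illustrative, not my exact paths):

Code:
# mount a dataset from System B (plexnasii) on System A:
mount -t nfs plexnasii:/mnt/vol1/test /mnt/remote-test

# copy a few hundred GB of real video files across the link and time it:
time cp -a /mnt/vol1/media/movies/. /mnt/remote-test/

# or push synthetic data straight across the NFS mount with dd:
dd if=/dev/zero of=/mnt/remote-test/netfile bs=10M count=50000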

I would have expected to see something close to the read and write speeds of the slowest box (in this case plexnas), around 6+Gbits/sec, but that didn't happen; then again, I realize synthetic testing is not real-world data transfer. Still, 4Gbits/sec was acceptable at this point.

I did swap out the Chelsio cards for 10Gbe Intel cards that I had been using for another project and ran the same test as above, and was astounded that the 4Gbits/sec speed dropped to about 1.2Gbits/sec with everything else being exactly the same. Needless to say, I think the Chelsio cards outperform the Intel cards at this point. Since I was not going to use the Intel cards, I did not bother to delve into why they were performing the way they were.

So now that all of my testing was complete (actually run 4 or 5 times in both directions), I set up replication with no compression (since I am moving video files, compression did not seem useful), no kB/s limit, and no encryption on the connection. I received a big warning when I selected this, but since I am on my local network, security was not a concern.

The replication started and my jaw about hit the floor: 800Mbits/sec. I deleted the replication process and recreated it thinking I had done something wrong. Same results, same speed. 800Mbits/sec. I deleted the replication and tried it with compression, no better (not that I expected it to be). CPU and system load on both boxes are low, my sending box more so than my receiving box. My drives are not doing anywhere near the work they were doing during the cp process.

On Corral, with the exact same hardware, exact same drives, exact same network configuration I was seeing 2.5Gbit/sec during replication between my two corral servers. (See Graphs and discussion here). I thought that was too slow given my network and hardware performance testing results, but now to go from 2.5Gbits/sec to 800Mbits/sec is another shock.

I am looking for any ideas, thoughts, suggestions, etc on why I am seeing such dismal performance on my replication.

Thanks for any insight!
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
I share your pain... I haven't run any large replication tasks lately, but I run a nightly rsync job that only gets ~340Mb/s between a pair of FreeNAS 11.0-U1 servers, both equipped with Intel X520-DA1 10Gb/s cards. And I'm sure it's not the network configuration: iperf testing confirms >9.3Gb/s rates both ways between the two servers. Very disappointing.
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
I wonder if it's related to this bug report?
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
I wonder if it's related to this bug report?
He he... I've been watching that bug report (#24405) for some time! It deals strictly with replication, and a fix most likely won't have any effect on my abysmal rsync transfer rates.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Not sure, it might be related. I did do testing with dd across an NFS link between the systems: no problems at all, and I saw 4Gbits/sec. Not as high as I had hoped, but far better than 800Mbits/sec. So if this is related to #24405, then it must be specific to how dd is used in the zfs send pipeline, as ordinary dd does not exhibit this behavior, at least not on my systems.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Not sure, it might be related. I did do testing with dd across an NFS link between the systems: no problems at all, and I saw 4Gbits/sec. Not as high as I had hoped, but far better than 800Mbits/sec. So if this is related to #24405, then it must be specific to how dd is used in the zfs send pipeline, as ordinary dd does not exhibit this behavior, at least not on my systems.
The problem isn't with dd per se; it has to do with the buffering and rebuffering during replication that cause artificial bottlenecks.

"the read/write flow shows a constant multiplication of operations with no buffering gains due to multiple rebuffering operations."

That being said, you should be getting better than 800Mbit/sec transfer rates. I have a direct 10Gb replication link between my two servers and average about 2Gbit/sec transfers across it.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Well, as luck would have it, when I was running FN10 I was getting 2.5Gbit/sec on my 40Gbe links. Now that I "upgraded" to FN11 with the absolutely exact same hardware and configuration, the number has dropped to the 800Mbit/sec I am seeing now...

Something clearly has changed in how the replication is done or the underlying processes or maybe network drivers or the planets are no longer aligned or something.....

Very frustrating.... I am sure glad I spent money to upgrade the network cards to 10Gbe and 40Gbe :smile:
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
Something clearly has changed in how the replication is done or the underlying processes or maybe network drivers or the planets are no longer aligned or something.....

I was thinking it may be something like this too. Wasn't Corral on an even newer FreeBSD version than FreeNAS 11? If so, it's possible that it had newer network drivers baked in.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
I'm thinking you may be right, but someone with better knowledge of what was under the hood of FN10 would have to chime in with that info. I can't imagine they rewrote how replication works, so maybe the driver (or the network stack itself) was simply better in FN10 than in FN11.

I am wondering if @cyberjock or @jgreco might know the answer to that....
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As it was years ago in the early 1GbE days, relatively minor things can gum up the works when you're dealing with high bandwidth hardware and software. It isn't unusual for some seemingly trite change to have a ripple effect. We've had an entire generation of users who grew up with hardware that was fast enough to slam full gigE in many cases, and the delicate art of tuning has been somewhat lost or forgotten. This comes back to bite us with the 10GbE and 40GbE. Buffer sizes, congestion algorithms, cut-through switching, client issues, and all sorts of other things affect all of this. It is also very helpful to remember that disks are inherently slow, and that the performance of pools tends to degrade over time, until you reach a steady state. Debugging this stuff is basically a matter of trying to identify bottlenecks and eliminate them.

I really haven't done too much with FreeNAS post-9.3 yet because things are so damn busy here. I've got a pile of ten (yes, 10) Dell R510's destined to become FreeNAS filers as soon as one of my clients lines up some disks for them. If anyone knows anyone interested in donating dozens of 2TB-4TB sized disks to a worthy nonprofit whose work benefits us all, drop me a line.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
@jgreco Well said as usual.

In my case, through repeated testing I have tried to determine the actual capabilities of my 10Gbe and 40Gbe networks. Since my "networks" are actually back-to-back twinax connections, that removes a lot of variables. Everything works OK (i.e., 4Gbit/sec on a 10Gbe card) until I try to use replication, and then it tanks. What is very interesting to me is that FreeNAS 10 performed much better on the exact same hardware, exact same cards, exact same cables, and exact same tests.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That suggests three areas to consider. Since I don't have any particular insight, I'm simply telling you what I'd look at and why.

1) The ZFS pool itself (both target and destination). ZFS is blazing fast especially on writes for a brand new pool because there's no fragmentation and allocation is guaranteed to be easy. As a pool has its space consumed and then freed and then consumed and then freed (etc), speed tends to fall off as the disks need to do more seeks, and this will have a very large impact. This is especially problematic if you are dealing with things such as thousands of small files, database files, VM storage, etc. As an example, you can't take your FreeNAS 9 pool, import it into FreeNAS 11, and then claim that it must be FreeNAS 11 that's being slow, because it *could* be pool aging. You would need to roll back to FreeNAS 9 on both sides and test it that way again to be able to rule out the pool itself as a factor. I wouldn't be shocked to find out that suddenly a reinstalled FreeNAS 9 was "much slower" and if so your problem may be fragmentation on the pool.
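One quick sanity check on both sides is the fragmentation estimate the pool itself reports (substitute your own pool name for vol1 here):

Code:
# FRAG is ZFS's estimate of free-space fragmentation across the pool:
zpool list -o name,size,allocated,free,fragmentation,capacity vol1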

2) ZFS itself could have some change that's impacted performance. This could range from ZFS tuning to issues such as insufficient memory to cache metadata. As a pool grows, the amount of metadata that needs to be analyzed and/or updated grows as well. Defaults for some ZFS tunables have changed over the years. Implementation in code has changed, generally for the better.

3) FreeNAS itself is mostly a management stack on top of a FreeBSD system, but it is certainly possible that there have been implementation changes that could impact speed. Replication is basically a bunch of userland foo interacting with ZFS foo across the network. Identify the ssh command being used for replication and see if you can actually sustain a high speed data transfer via a similar command with scp. Maybe you need a less-intensive encryption protocol in SSH, or to disable SSH compression. Maybe you need to make adjustments to the TCP sendsize and recvsize defaults. Hopefully you've already done some testing with iperf on both sides? Does netstat -an show large amounts of data queuing and dequeuing rapidly? Generally that's a good-ish sign. If it shows the system slamming up against the max sendsize consistently, that's showing you where there's a bottleneck.
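As a rough example of that sort of check (hostname and cipher here are just placeholders, and a dd pipe is used instead of scp so the far side's disks stay out of the picture):

Code:
# raw ssh throughput with the default cipher, no disks on the far side:
dd if=/dev/zero bs=1M count=10000 | ssh plexnasii "cat > /dev/null"

# same thing with a cheaper cipher, to see whether encryption is the limit:
dd if=/dev/zero bs=1M count=10000 | ssh -c aes128-gcm@openssh.com plexnasii "cat > /dev/null"

# while a transfer is running, watch the socket send/receive queues:
netstat -an | grep ESTAB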

Complex systems can be hard to debug, but on the flip side, if you're persistent, usually you can find the issue.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Those are all good points, but I do not think any of them is the issue except maybe #3. I can mount an NFS share and copy directly to it at 4Gbits/sec. That rules out the first two of your three ideas, or the problem would be seen any time I write to the pool, regardless of the mechanism I use (cp, dd, ssh, etc).

I think you are right with #3; I think the issue is how replication is happening and the commands it is using to do so. I have compression shut off in the replication settings (video files) and encryption shut off as well. When I went 10Gbe originally I put in the tunables you mentioned, but again, the speed loss only shows up when replicating.
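For reference, the tunables in question were along these lines (these values are illustrative starting points pulled from forum advice, not necessarily exactly what I have set):

Code:
# sysctl tunables commonly suggested for 10/40Gbe on FreeBSD:
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144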

The testing I did was as follows:
1) iperf both ways - 16Gbits/sec on my 40Gbe interfaces and 9.8Gbits/sec on my 10Gbe interfaces. The 10Gbe interfaces were tested with both Intel and Chelsio cards.
2) Local system read and write tests to an uncompressed dataset using dd
3) Local system read and write to an existing dataset using video files
4) Remote-mounted NFS shares from FreeNAS; read video files from one dataset and write them to another dataset. 4Gbits/sec
5) Man-in-the-middle read/write test: a system connected to both FreeNAS servers via 10Gbe, with an NFS mount from each server, copying from one mount point to the other. 4Gbits/sec

In the case of 2 - 4 above, the slowest transfer rates were over 6Gbits/sec.

Before I upgraded from Corral I was getting 2.5Gbits/sec during replication with the same pools, same data, same hardware, so something has changed.

I guess if I knew the exact commands the zfs send and receive were using to do the replication I could run them manually, but I suspect after all the testing above that it is a problem with replication. I have two more servers en route for another client, and once those arrive I can install Corral and see if I can still get the 2.5Gbits/sec replication I was seeing when I had Corral installed on my two servers, but I don't want to downgrade my production and backup servers back to Corral.
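What I have in mind for a manual test is roughly the following; snapshot, dataset, and host names are placeholders, and I am not claiming this is the exact pipeline autorepl.py builds:

Code:
# take a snapshot and send it by hand over ssh:
zfs snapshot vol1/media@repltest
zfs send vol1/media@repltest | ssh plexnasii "zfs receive -F vol1/media-copy"

# or, if mbuffer is available, buffer the stream on both ends:
zfs send vol1/media@repltest | mbuffer -s 128k -m 1G | \
  ssh plexnasii "mbuffer -s 128k -m 1G | zfs receive -F vol1/media-copy"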
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Those are all good points, but I do not think any of them is the issue except maybe #3. I can mount an NFS share and copy directly to it at 4Gbits/sec. That rules out the first two of your three ideas, or the problem would be seen any time I write to the pool, regardless of the mechanism I use (cp, dd, ssh, etc).

I think you are right with #3; I think the issue is how replication is happening and the commands it is using to do so. I have compression shut off in the replication settings (video files) and encryption shut off as well. When I went 10Gbe originally I put in the tunables you mentioned, but again, the speed loss only shows up when replicating.

The testing I did was as follows:
1) iperf both ways - 16Gbits/sec on my 40Gbe interfaces and 9.8Gbits/sec on my 10Gbe interfaces. The 10Gbe interfaces were tested with both Intel and Chelsio cards.
2) Local system read and write tests to an uncompressed dataset using dd
3) Local system read and write to an existing dataset using video files
4) Remote-mounted NFS shares from FreeNAS; read video files from one dataset and write them to another dataset. 4Gbits/sec
5) Man-in-the-middle read/write test: a system connected to both FreeNAS servers via 10Gbe, with an NFS mount from each server, copying from one mount point to the other. 4Gbits/sec

In the case of 2 - 4 above, the slowest transfer rates were over 6Gbits/sec.

Before I upgraded from Corral I was getting 2.5Gbits/sec during replication with the same pools, same data, same hardware, so something has changed.

I guess if I knew the exact commands the zfs send and receive were using to do the replication I could run them manually, but I suspect after all the testing above that it is a problem with replication. I have two more servers en route for another client, and once those arrive I can install Corral and see if I can still get the 2.5Gbits/sec replication I was seeing when I had Corral installed on my two servers, but I don't want to downgrade my production and backup servers back to Corral.
Here is a snip of one of my replication tasks running on freeNAS 11.0-U1.
Replication.JPG

I average just over 2Gbit/sec on replication tasks with a direct 10Gb link between the two servers (server 1 and server 2 in my sig). During the replication process sshd is running at about 88% cpu and there are two dd processes running about the same. The devs are aware there is a problem with the datastream efficiencies but I haven't seen much progress on the ticket. There is something different going on in your environment as I consistently get just over 2Gb on the transfer.

Edit: Here is an iperf between the direct link interfaces.
Code:
Welcome to FreeNAS
root@CLNAS02:~ # iperf -c 172.16.1.10 -t 30
------------------------------------------------------------
Client connecting to 172.16.1.10, TCP port 5001
TCP window size: 8.01 MByte (default)
------------------------------------------------------------
[  3] local 172.16.1.20 port 52441 connected with 172.16.1.10 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-30.0 sec  34.6 GBytes  9.90 Gbits/sec
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
I guess if I knew the exact commands the zfs send and receive were using to do the replication I could run them manually but I suspect after all the testing above that it is a problem with replication.

This guide will help you with manually sending/receiving zfs data.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Wasn't Corral on an even newer FreeBSD version than FreeNAS 11? If so, it's possible that it had newer network drivers baked in.
No. FreeNAS 11 is also based on FreeBSD 11. It's actually a later revision of FreeBSD 11 than that used by Corral.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Here is a snip of one of my replication tasks running on freeNAS 11.0-U1.
View attachment 19520

I average just over 2Gbit/sec on replication tasks with a direct 10Gb link between the two servers (server 1 and server 2 in my sig). During the replication process sshd is running at about 88% cpu and there are two dd processes running about the same. The devs are aware there is a problem with the datastream efficiencies but I haven't seen much progress on the ticket. There is something different going on in your environment as I consistently get just over 2Gb on the transfer.

Edit: Here is an iperf between the direct link interfaces.
Code:
Welcome to FreeNAS
root@CLNAS02:~ # iperf -c 172.16.1.10 -t 30
------------------------------------------------------------
Client connecting to 172.16.1.10, TCP port 5001
TCP window size: 8.01 MByte (default)
------------------------------------------------------------
[  3] local 172.16.1.20 port 52441 connected with 172.16.1.10 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-30.0 sec  34.6 GBytes  9.90 Gbits/sec

Yep, I get great performance with iperf and with a normal copy across NFS (4+ Gbits/sec); it's only replication where the problem exists.

Code:
root@plexnas:~ # iperf -c plexnasii -t 30
------------------------------------------------------------
Client connecting to plexnasii, TCP port 5001
TCP window size: 2.01 MByte (default)
------------------------------------------------------------
[  3] local 10.0.12.1 port 51133 connected with 10.0.12.2 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-30.0 sec  34.2 GBytes  9.78 Gbits/sec
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
No. FreeNAS 11 is also based on FreeBSD 11. It's actually a later revision of FreeBSD 11 than that used by Corral.
Good to know. I never used Corral so wasn't sure what the underlying FreeBSD version was.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So I've been doing some of this testing with Nick Wolff (the person that started bug 24405). Here's some tidbits:

1. Yes, replication seems to be slower than we "would expect it to be". We're working on various aspects of the replication code to, hopefully, resolve the bottleneck, assuming the bottleneck is with our implementation per autorepl.py.
2. We do recognize that the replication code could use some updating. In particular, restartable replication is something we want to integrate. ETA: Not known by me.
3. Between 2 top-of-the-line TrueNAS Z35s I've done some testing and found that there is a possibility that ZFS block fragmentation is a real killer for throughput. In my testing on a real-world production box, replication with netcat (so none of the autorepl.py stuff to possibly bottleneck it) to a new zpool could only get about 500MB/sec peak on the network. The average is faster than using autorepl.py, but it was still not as fast as we were expecting. In this case we had dedicated 10Gb links between the source and destination, and were still disappointed with only about 50% utilization on 10Gb. I was fully expecting to hit at least 7Gb/sec if not more for the transfer with netcat, but that isn't happening for some reason. The zpool does seem to be staying quite busy; now, whether it's 60% busy or 100% busy is very difficult to ascertain. So pinpointing it to the zpool isn't trivial to do.
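For reference, a bare-bones netcat pipe of that sort looks roughly like this (hosts, port, and dataset names here are placeholders, not the actual test setup):

Code:
# on the destination, listen on a port and receive the stream:
nc -l 8023 | zfs receive -F tank/backup

# on the source, send a snapshot straight over TCP, no ssh involved:
zfs send tank/data@snap | nc 10.0.0.2 8023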

In summary, there are definitely some improvements we know we "should" make (like fixing the performance bottlenecks) and improvements we know "need" to be added (like restartable replication), but I think there's a bit more going on than meets the eye. In any case, we are working on the problems and hoping that we can resolve these issues soon and satisfactorily. I know that iXsystems is selling larger and larger systems, and telling customers "yeah, I know you're replicating 200+TB to your backup system, but we are sorry that you can only do 350MB/sec" isn't going to go very far. iXsystems sells large multi-PB systems, and having a 300MB/sec transfer rate makes replication nearly useless. ZFS replication is one of our great features as it makes backing up data very fast and easy. But it's not going so fast and we need to find and correct the reasons why. ;)

If memory serves me right, 11.1 should have some changes incorporated, but that is quite some time off obviously. If you want to look at what we've changed so far and see if it helps you, try reading the changes at https://bugs.freenas.org/projects/f...cd3190192bd3af6c73b9e346be59a3400889595f/diff as those are the current changes for 11.1.

As for 10Gb and 40Gb throughput testing with iperf, I know that TrueNAS has a lot of very specific optimizations for our specific 10Gb and 40Gb NICs to ensure good throughput. I don't have easy access to those at the moment, but I regularly have been able to saturate 10Gb with iperf with a single processing thread without problems, and on 40Gb, I can do it with about 3 threads (sometimes 2 depending on the hardware switch in between). So 16Gb/sec doesn't seem to be completely terrible in my experience as 3x that would more than saturate 40Gb.

Hope this helps clear up some of what is going on "behind the scenes".
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
@cyberjock

Great info. It seems that with iperf I am seeing what I am supposed to be seeing, so I guess I will live with what I have until 11.1 comes out and see if there is any improvement.
 