TrueNAS Scale Slow Replication

xeroiv

Cadet
Joined
May 18, 2018
Messages
9
Hi there, hoping someone can help me understand what more I can do to improve replication speeds between two TrueNAS SCALE machines (VMs hosted on Proxmox servers). I get acceptable iperf3 results between the two sides, but my replication speeds top out around 2 Gbps according to both UniFi and the TrueNAS network reports. The replication job is set to SSH+Netcat with no encryption.

The hosts are DL380e G8 servers with 2x E5-2430 CPUs and 32GB of RAM. Each TrueNAS has an LSI 9208i passed through to it with 12 HDDs attached. The sender is set up as 6x mirrored vdevs and the receiver as a single 12-disk RAIDZ2 vdev. Any help on other things I can try, or suggestions on how to find the bottleneck, is greatly appreciated.

Code:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  9.44 GBytes  8.11 Gbits/sec  2426             sender
[  5]   0.00-10.04  sec  9.44 GBytes  8.08 Gbits/sec                   receiver
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The sender is set up as 6x mirrored vdevs and the receiver as a single 12-disk RAIDZ2 vdev.

Set the receiver up as 6x mirrors and see if it gets better. The single RAIDZ2 vdev is likely the bottleneck: a RAIDZ vdev is limited in IOPS to something proportional to the IOPS of the slowest member device of that vdev (so maybe only 200 IOPS), while a pool of 6 mirror vdevs can sustain up to about 6x write IOPS (1200 IOPS) or 12x read IOPS (2400 IOPS). I have picked a typical number of IOPS for a HDD and yours may not be exactly 200 per member device.
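
If you want to see where the time goes, watch per-vdev activity on both sides while a replication runs; something like this (the pool name "tank" is just an example):
Code:
# report per-vdev bandwidth and IOPS every 5 seconds;
# if the RAIDZ2 vdev's write ops hover near a single disk's limit, there's your bottleneck
zpool iostat -v tank 5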

You haven't described the content you are trying to replicate. Small files, databases, or VM block data are naturally going to replicate much more slowly due to all the seeks.

Also be aware that this will likely perform better on bare metal. Hypervisor layers add a bit of latency, and this limits performance.
 

xeroiv

Cadet
Joined
May 18, 2018
Messages
9
Set the receiver up as 6x mirrors and see if it gets better. The single RAIDZ2 vdev is likely the bottleneck: a RAIDZ vdev is limited in IOPS to something proportional to the IOPS of the slowest member device of that vdev (so maybe only 200 IOPS), while a pool of 6 mirror vdevs can sustain up to about 6x write IOPS (1200 IOPS) or 12x read IOPS (2400 IOPS). I have picked a typical number of IOPS for a HDD and yours may not be exactly 200 per member device.

You haven't described the content you are trying to replicate. Small files, databases, or VM block data are naturally going to replicate much more slowly due to all the seeks.

Also be aware that this will likely perform better on bare metal. Hypervisor layers add a bit of latency, and this limits performance.
Thanks for the advice on the receiving pool setup. Since the drives are 2TB each on the receiving machine vs 4TB each on the sending machine, I'm not sure I can give up four more drives to parity and still have enough storage for the target snapshots. What is a good method to test the sustained read or write performance of each pool and check for the IOPS bottleneck? I suppose one way would be to run the replication job in reverse, if only the write performance of the RAIDZ2 pool is in question. I might be able to find some SSDs to use as a test as well, if that would tell me anything.
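
For the direct pool test, I'm thinking something like fio run locally on each machine, assuming it's available on SCALE (the dataset path is just a placeholder):
Code:
# sequential 1M writes, roughly the shape of a replication stream
fio --name=seqwrite --directory=/mnt/tank/fio-test --rw=write --bs=1M --size=8G --ioengine=psync --end_fsync=1

# random 128k writes, closer to a seek-heavy workload
fio --name=randwrite --directory=/mnt/tank/fio-test --rw=randwrite --bs=128k --size=8G --ioengine=psync --end_fsync=1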

The snapshots are sending a backup of my mixed media folders, so they include lots of small files such as MP3s as well as a large chunk of MKV files that may be between 300MB and 2GB each.

I am aware there is going to be some penalty for virtualization. To back that up, iperf3 between the bare-metal hosts gives about 8.92 Gbps versus the results I posted from the TrueNAS VMs. It might just be that, at the end of the day, this is the best my pools will do until the drives are upgraded, and I suppose that is OK: since the initial sync completed, incremental backups only take a few minutes due to the relatively small amount of change between weekly snapshots.
 

xeroiv

Cadet
Joined
May 18, 2018
Messages
9
So it appears you led me to a reasonable explanation for the slower-than-expected performance of my setup. When I ran the replication in the reverse direction I could see a clear disparity, which is likely down to the expected write performance of the two pool layouts. When I pushed from the RAIDZ2 setup (writing to the mirrors) I saw an average transfer speed of about 299 MB/s, vs only about 220 MB/s when I pushed from the mirrored vdevs (writing to the RAIDZ2). So the relatively poor IOPS of a RAIDZ2 setup, combined with the mixture of large and small files in the replication, seems like the most reasonable culprit. I don't have a desire to toss the 2TB drives until they start to fail on me, so I'll just have to accept this as the performance I can expect from this setup.
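
If I ever want to double-check, one way to take the receiving pool out of the picture entirely would be something like this (dataset and snapshot names are just examples):
Code:
# read side only: how fast the sending pool can feed a stream
zfs send -v tank/media@weekly > /dev/null

# read side plus network, still no disk writes on the receiver
zfs send tank/media@weekly | ssh receiver 'cat > /dev/null'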
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
When I ran the replication in the reverse direction I could see a clear disparity, which is likely down to the expected write performance of the two pool layouts.
What’s the average size of the files you transfer? For example, if your files are 5MB or larger, make sure you set the dataset recordsize to 1M. Or if you’re dealing with a MySQL database, your recordsize should be 16K. There is also an important factor: if you change the recordsize, you need to move the files to a fresh dataset, otherwise only new data written to the dataset will benefit from the performance increase. See the Pools and Datasets section of this thread.
 

xeroiv

Cadet
Joined
May 18, 2018
Messages
9
What’s the average size of the files you transfer? For example, if your files are 5MB or larger, make sure you set the dataset recordsize to 1M. Or if you’re dealing with a MySQL database, your recordsize should be 16K. There is also an important factor: if you change the recordsize, you need to move the files to a fresh dataset, otherwise only new data written to the dataset will benefit from the performance increase. See the Pools and Datasets section of this thread.
Thanks for the tip. According to this small script, my average file size on the dataset is about 69MB. Does setting the recordsize to 1MB vs the default 128KiB on both pools matter in this case, or is the suggestion just for the receiving pool?
Code:
root@truenas[/mnt/FreeNAS-ZFS/share]# find . -type f -print0 | xargs -0 ls -l | gawk '{sum += $5; n++;} END {print sum/n/1024;}'
70655.2
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Does setting the recordsize to 1MB vs the default 128KiB on both pools matter in this case, or is the suggestion just for the receiving pool?
You need to set it on both pools. Most importantly, you need to create new 1M datasets, then move the files from the old datasets to the new ones.
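
For example, something along these lines (pool and dataset names are placeholders):
Code:
# create a fresh dataset with 1M records and copy the data into it
zfs create -o recordsize=1M tank/media-1m
rsync -a /mnt/tank/media/ /mnt/tank/media-1m/

# confirm the property took effect
zfs get recordsize tank/media-1m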

A simpler, human-readable command, which returns the same result as yours:
Code:
# find /mnt/software/ix-applications -type f -print0 | xargs -0 ls -l | gawk '{sum += $5; n++;} END {print sum/n;}' | numfmt --to=iec
43K
# find /mnt/software/ix-applications -type f -ls | awk '{s+=$7} END {printf "%.0f\n", s/NR}' | numfmt --to=iec
43K

However, your formula fails on large files: gawk prints the average in scientific notation, which numfmt rejects; hence the search for an alternative. numfmt also gives the result in proper recordsize format:
Code:
# find /mnt/default/media -type f -print0 | xargs -0 ls -l | gawk '{sum += $5; n++;} END {print sum/n;}' | numfmt --to=iec
numfmt: invalid suffix in input: ‘6.45054e+08’
# find /mnt/default/media -type f -ls | awk '{s+=$7} END {printf "%.0f\n", s/NR}' | numfmt --to=iec
616M

I'm going to add that to the guide, thank you for the useful idea.
 