Best Way to Transfer 20TB Volume QNAP QuTS to TN-SCALE?

Maximilious

Dabbler
Joined
Jun 20, 2023
Messages
15
Hey All,

I'm in the middle of a migration from my 40TB QNAP to a new TrueNAS SCALE server. I'm about half-way through my data transfer when I noticed that my rsync speeds are very slow compared to a different rsync I had to do of the same data last month between a Buffalo and QNAP device. On the latter I was transferring roughly 4TB in 12 hours, but now I'm transferring 1TB in 12 hours - painfully slow.

My system is in my signature. I have LZ4 compression on the destination dataset with deduplication disabled. The server has a 10Gb fiber link (verified speed) and the source QNAP has a 2.5Gb uplink (also verified). They are on different switches, but I've verified the 10Gb link between them negotiates at the correct speed as well. An iperf3 test between the systems pulls about 1.5Gb/s, but my rsync is capping out around 17MB/s.

The rsync command I'm using is the following:
sudo rsync -avhtP --stats --progress --exclude '.@__thumb*' --exclude '@Recycle' --exclude '@Recently-Snapshot' <QNAPsource> <TrueNASdestination>

I'm running this from an SSH session with the folders mounted through fstab. Perhaps this is not the best way to handle it since it's not through the GUI, but it's the method I'm used to. I also recently read about the "zfs send" command but haven't done any research into it yet. Should I use that method for the migration instead?

It's also worth mentioning that the QNAP source unit is running QuTS Hero with native ZFS support, and I do have compression and dedupe enabled on some of those datasets, which may also be part of the issue. My TrueNAS CPU does not seem stressed at all, but I haven't investigated my QNAP's resources either.

I'm coming up on my largest 20TB volume (not deduped or compressed on the source unit) and do not want to suffer through a transfer time of a month at the current rate. Any suggestions would be appreciated!
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
For bulk transfer with ssh I would ensure that the fastest encryption setting is used, and compression is disabled.
This can be done with rsync -e 'ssh -c <cipher> -o Compression=no', or by adding the options to ~/.ssh/config for that host.
You might need to run ssh -Q cipher on each end to find matching ciphers.
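For example, something like the following - the cipher name here is only an illustration, substitute whichever fast cipher both ends actually list:

# list the ciphers each end supports; pick one that appears on both
ssh -Q cipher

# then run rsync over ssh with that cipher and ssh-level compression turned off
rsync -avhP --stats -e 'ssh -c aes128-gcm@openssh.com -o Compression=no' <QNAPsource> <TrueNASdestination>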
Many years ago you could disable the cipher entirely or use arcfour, but neither of those seems to be generally available now.
No real idea how this relates to the previous run where it was faster, but maybe the Buffalo accepted a more antiquated, i.e. faster and less secure, cipher. Maybe you can run ssh -Q cipher on the Buffalo and check.
You could also try turning on the rsync server, which would allow you to rsync without encryption.
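Roughly, that means running an rsync daemon on the destination and pushing to it over the rsync protocol instead of ssh. A minimal sketch - the module name and path below are only placeholders:

# /etc/rsyncd.conf on the TrueNAS side (or the GUI's rsync module setup, if available)
[migrate]
    path = /mnt/tank/migrate
    read only = no

# start the daemon on the destination, then push from the QNAP side with no ssh in the path
rsync --daemon
rsync -avhP --stats <QNAPsource> rsync://<TrueNASdestination>/migrate/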
Trying too many options might chew up more time than just leaving the transfer alone would use.

https://blog.twogate.com/entry/2020/07/30/benchmarking-ssh-connection-what-is-the-fastest-cipher interesting link, also tests other crypto components which may be part of the data path
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
I don't think zfs send will work from QNAP to TrueNAS, as QNAP uses their own, modified version of ZFS.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
I have not engaged in bulk transfers via rsync in a while, so please bear with me. But I did recently have the pleasure of doing a bulk transfer via ZFS send over an SSH connection with netcat enabled. Between two TrueNAS servers with a single HDD VDEV each, I was getting 200-400MiB/s transfer speeds over 10GbE fiber through a single switch. My recollection of past ZFS sends over SSH only was a lower 100MiB/s or so with the same data / hardware / software settings.

In my limited experience, rsync is very slow when it comes to transferring many tiny files because it seems to check each one individually, and IIRC TrueNAS also confirms each file has been written before the next one can come in (sync and SLOG notwithstanding).

When I built my new server, I compressed all the old file-system backups I had stored on the NAS into sparse bundles (macOS) to hide the ridiculous number of tiny files each OS ships with these days. That significantly reduced the number of files to transfer, which in turn meant that rsync spent more time transferring data and less time sending / receiving confirmation that the file had been transferred.

That latency issue can also be addressed in a very limited manner with a faster connection. So, on 10GbE the transfer will be quicker than on Gigabit, but the benefit is marginal. Writing many small files is a pain for most OSes that care about file integrity, like TrueNAS, unless the pool features an sVDEV with fast SSDs or consists entirely of SSDs.

But all in all, I'd research whether you could ZFS-pull the data into the new NAS using an ssh+netcat connection.
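As a very rough illustration of the raw netcat variant of that - the dataset names, snapshot, and port below are placeholders, and per AlexGG the QNAP's zfs send may not produce a stream TrueNAS can receive at all:

# on the receiving TrueNAS box: listen and receive into the target dataset (-s allows resuming)
nc -l 8023 | zfs receive -s tank/migrated
# on the sending box: stream a snapshot into that listener
zfs send -v sourcepool/dataset@migrate | nc <truenas-ip> 8023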
 

Maximilious

Dabbler
Joined
Jun 20, 2023
Messages
15
For bulk transfer with ssh I would ensure that the fastest encryption setting is used, and compression is disabled.
This can be done with rsync -e 'ssh -c <cipher> -o Compression=no', or by adding the options to ~/.ssh/config for that host.

Thanks all for your responses!

I'm running my next dataset with these ssh options (faster cipher, compression off) and speeds have definitely improved. This dataset is mostly much smaller files, but I'm seeing some of the larger ones flash by at around 60MB/s before they finish - just saw another go by at 250MB/s. I might hit higher speeds on larger files; this dataset is only about 300GB and all smaller files (container configurations).

The final 20TB volume is movies with an average size of about 1.5GB a file, so we'll see what speeds we hit after this last 300GB is moved - hopefully done very soon this morning!
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
@Constantin raises a good point about netcat - maybe you could use that rather than ssh as a connector? Or he could explain what he meant by ssh+netcat? I haven't tried it, and I don't know what the actual requirements are for rsync's remote-shell ("rsh"/-e) program; if unidirectional is OK then netcat would be fine.
If bidirectional is required, you might be able to simulate it with two netcat connections, or perhaps use socat, which I'm not that familiar with.
Either way would probably take out the encryption and compression layers, which would reduce CPU requirements, and hopefully increase throughput.
You could also use tar/cpio: pipe it into netcat to the other side, then on the other side pipe netcat into tar/cpio for extraction. That's definitely one-way, and a broken connection isn't as recoverable as it would be with rsync.
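A rough one-way sketch of that - the paths, port, and hostname are placeholders:

# on the receiving TrueNAS side: listen and unpack into the destination dataset
nc -l 9000 | tar -xf - -C /mnt/tank/dataset
# on the sending QNAP side: stream the source directory into that listener
tar -cf - -C /share/source . | nc <truenas-ip> 9000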
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
When setting up a replication task between two TrueNAS machines (which uses ZFS send/receive under the hood), the connection is via SSH by default. However, you can elect to change that to SSH+NETCAT, which usually yields performance improvements. Note: for my push configuration, I set its netcat location to LOCAL.

So glad to hear the transfers are going quicker now. File size is a huge factor for rsync. Unfortunately, I didn't see any option to enable netcat for rsync replication tasks, so I guess it's not available there?
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
Do you know what ssh+netcat actually does? Does it set up one or two netcat sessions locally, then ssh to the remote and set up the complement, thus connecting them? That would make some sense, but for the OP it sounds like he doesn't have the option to use the replication task, so he needs to know the nuts and bolts.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Apologies, no I don’t know how the middleware under the UI implements SSH+netcat for the ZFS send replication task.
 

Maximilious

Dabbler
Joined
Jun 20, 2023
Messages
15
I wanted to report back that after starting my larger dataset around 9AM this morning, it has already written 4TB in 8 hours versus 4TB in 12 hours on my previous Buffalo-to-QNAP session, so the ssh encryption settings were definitely a factor. I forgot to mention that even though I do have quite a few small files in my other datasets, 4TB in 12 hours was a pretty consistent benchmark during that previous transfer. The Buffalo unit was running a Windows Server OS, so I'm not sure how rsync handled that specifically - possibly no encryption by default, but I'm not certain.

I've never heard of netcat, so that's entirely new to me. I will need to send this 20TB of data to a new pool, as I'll be installing the disks from my QNAP into my TrueNAS server after the data has been fully transferred over, so perhaps I'll use zfs send or netcat for that operation (I assume zfs send, since both pools will be local to the system by that point).
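From what I've read so far, the local pool-to-pool zfs send would look roughly like this - the pool, dataset, and snapshot names are just placeholders, so please correct me if I've got it wrong:

# snapshot the source dataset, then send it (with descendants) into the new pool
zfs snapshot -r oldpool/movies@migrate
zfs send -R oldpool/movies@migrate | zfs receive -u newpool/movies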

Thanks for everyone's input!
 