Windows Server robocopy to FreeNAS extremely slow

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
Hi All

I'm trying to set up a backup server with FreeNAS. I basically have one bigass zpool of about 300TB, with a dataset (file system) on it for each server (Xeon Skylake W-2102 CPU, 64GB of RAM, no L2ARC). The pool consists of 6 vdevs of 8 disks each in a raidz2 config.

The initial copy from windows to freenas went fast. About 850 MB/s and I was very pleased :)

Now that the data is there (about 100TB) it has gotten extremely slow. It maxed out at about 40 MB/s for a single server and 60 MB/s if I run it from all servers at the same time.

I'm using "robocopy /mir" over SMB, which looks at the timestamps and file sizes of the source and destination and only copies new or changed files. I ran into the "slow folder listing" threads and already disabled the DOS bits and increased vfs.zfs.arc_meta_limit, but that wasn't any help. The thing that strikes me the most is that even when all four nodes are copying to FreeNAS (at 60 MB/s), I can still copy an additional 200GB file at 350 MB/s, which maxes out the read speed of the Windows drive array.

Any advice on how to debug this?

Cheers,
Marum
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
Somebody around that can give me some directions to search in? I really need this server in production... :s
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
Have you looked at the type of data you're copying? If the first batch was lots of large files, your speed could have made sense, but if you're now into a section of files that are small and high in number, you will be spending much more time writing file metadata than actual file data, so copy speeds will drop like you have seen. There may be other causes too, but that's the first obvious one you can look at.
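One rough way to check this is to count how many files in the source tree are small. The sketch below is only an illustration: the function name `size_profile` and the 64 KiB cut-off for "small" are assumptions, not something taken from robocopy or FreeNAS.

```python
import os


def size_profile(root, small_limit=64 * 1024):
    """Walk root and summarise how many files are smaller than small_limit bytes.

    small_limit is an arbitrary threshold chosen for illustration.
    """
    total_files = 0
    total_bytes = 0
    small_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total_files += 1
            total_bytes += size
            if size < small_limit:
                small_files += 1
    return {"files": total_files, "bytes": total_bytes, "small_files": small_files}
```

If most of the files fall under the threshold, per-file metadata work is likely to dominate over raw throughput.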
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
Thanks for the reply, sretalla. That would indeed make a lot of sense. Unfortunately, it's the same folder/share I'm running the backup on.

The initial copy of share1 to the FreeNAS box ran at speeds of 800 MB/s. Now I rerun the same command for share1 and it only runs at 60 MB/s. The only difference is that there was no data to compare to in the initial copy, but there is for the second backup. The file comparison is done by file size and timestamp.
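To make the comparison concrete, here is a minimal sketch of a size-and-timestamp check. It is not robocopy's actual code; the name `needs_copy` and the 2-second tolerance are assumptions for illustration only. The point is that every file costs at least one metadata lookup on the destination, even when nothing has to be copied.

```python
import os


def needs_copy(src_path, dst_path, tolerance_s=2.0):
    """Return True if dst is missing or differs from src in size or mtime.

    tolerance_s is a hypothetical allowance for coarse timestamp granularity.
    """
    src = os.stat(src_path)
    try:
        dst = os.stat(dst_path)  # one metadata round trip per file
    except FileNotFoundError:
        return True  # nothing on the destination yet, so copy
    if src.st_size != dst.st_size:
        return True
    return abs(src.st_mtime - dst.st_mtime) > tolerance_s
```

On the first run the destination is empty, so the lookup fails fast and the time goes into moving data. On a rerun almost every file takes the full comparison path and very little data moves.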

I also notice that this goes fast for the first couple of seconds, but slows down tremendously after that. After a minute or so, it seems to proceed in bursts. Usually this is a symptom of a bottleneck somewhere, but I can't find what or where. I/O seems normal and smbd is only at ~20%. The ARC hit ratio is about 90%. RAM is at a steady 4GB and 'wired' fills it up to 60GB as it should. Swap is free. No idea what to delve into. It might be Samba, but I don't know how to debug this.
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
I did the only test I could think of: I created a massive folder with about 50k files on the FreeNAS ZFS pool and tried to view this folder from Windows. If the culprit is indeed slow folder listing, it should be noticeable with a directory like that. Unfortunately (or luckily), the folder loads within a second.
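A more repeatable version of that test is to time the listing from a script and read each file's size and timestamp while doing so, since that is closer to what a sync has to do. This is only a sketch: `time_listing` is a made-up helper, and how much of the metadata comes back with the listing itself depends on the OS and the SMB client.

```python
import os
import time


def time_listing(path):
    """List one directory, read size and mtime of every file, and time it."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                info = entry.stat(follow_symlinks=False)
                _ = (info.st_size, info.st_mtime)  # touch the fields a sync compares
                count += 1
    return count, time.perf_counter() - start
```

Running it against the share path (for example a mapped drive letter) before and after a settings change gives a number to compare instead of a feeling.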
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,555
Turn off atime, use rsync instead of robocopy, and divide your data into manageable folders. What I think happens is that robocopy reads the data already written, ZFS updates the access times, and this costs you bandwidth. The application I manage professionally vaults multi-TB CAD data, but we never put more than 100 GB in a single folder (or vault).
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
Wow, atime.. that would make so much sense. I just checked and it's turned 'on' on all file systems. Switched it to Off. I'll rerun the backup tonight and report back. Thanks Garm!
 

SaraNobi1

Cadet
Joined
Dec 14, 2019
Messages
4
marum, I had a problem like that and I wonder: was your problem solved? I have seen solutions and recommended programs.
 

RogyMike2IT

Cadet
Joined
Dec 14, 2019
Messages
1
Using rsync or gsrichcopy360 instead of robocopy will totally solve your problem. There's no need to spend more time on it, as I think you haven't found any solution with robocopy so far.
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
Hi All,

A friend told me this was his first search engine result for "slow robocopy freenas". Thought I'd share my experience and report back.

TL;DR: Yes, I did get it to work at decent (not perfect) speeds. There were two tweaks: disable atime, and disable DOS attributes in samba. If you need DOS attributes, I have no solution.

Yesyes, "use a different program". I don't want to install rsync on all my servers. Windows is already unstable, I don't want to bring other software into the mix. The question is why. Why is rsync/gsrichcopy/... faster?

As far as I understand this (not an expert, just a techy such as yourself), robocopy has to be backwards compatible with old and new Microsoft systems. This means it has to take care of DOS attributes, and *nix needs a way to store these. Comparing these takes time. You can disable these DOS attributes in your SMB config editor in FreeNAS. A more detailed explanation is provided in the link at the top. The title says it's about browsing performance, but you need this for robocopy file comparison. Trust me.

The second part is atime. Atime stands for access time. It keeps track of when a file was last accessed. You don't have to be a genius to figure out that this messes up your syncing speeds. It turns every read into a write, which obviously takes time. The link about this at the top takes you to the manual.
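For anyone who wants to script the atime change instead of clicking through the UI, here is a small sketch that builds the `zfs get` and `zfs set atime=off` commands for a list of datasets. The dataset name in the usage line is hypothetical, and `dry_run` defaults to printing the commands rather than running them.

```python
import subprocess


def atime_commands(datasets):
    """Build (not run) the zfs commands that show and then disable atime."""
    commands = []
    for ds in datasets:
        commands.append(["zfs", "get", "-H", "-o", "value", "atime", ds])
        commands.append(["zfs", "set", "atime=off", ds])
    return commands


def run_commands(commands, dry_run=True):
    """Print each command, or execute it on the NAS when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Hypothetical dataset name; replace with your own.
run_commands(atime_commands(["tank/backups/server1"]))
```

ZFS properties are inherited, so setting atime on a parent dataset also covers children that don't override it locally.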

With these two tweaks I got my sync speed up to 300MB/s running a robocopy with /MT switch. Initial transfer was at 800MB/s, without these tweaks I got 60MB/s.

[I'm not sure the /MT option is a good idea. Robocopy is multi-threaded for some reason, while virtually no other low-level copy tools out there use multi-threading. The reasoning is that I/O never is, so why would the copy program support it. Following the same logic, Samba is still single-threaded (right?)... Do your own tests, use whatever works best :) ]

Good luck!
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,544
DOS attributes are preserved if you use vfs_ixnas. They are written as file flags via a call to chflags(2). The performance impact of having them enabled in this case is basically zero.
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
@anodos, thank you for pointing that out. I am aware that the article I linked to is about file browsing on FreeBSD 9.x. I assume it is still relevant, correct? (Note that I have over 1k folders with 50k+ files in them.)

About the syncing issues. The only algorithms that I can think of to compare folders, all need directory listing. It is not clear to me what kind of listing and details robocopy asks for over SMB protocol. This could be a plain file list. I am not sure if it can include attributes in one go, or that this requires multiple calls. I certainly have no idea what info robocopy gathers.

As I mentioned, atime did the magic for me. Apart from that, there seems to be something else going on. Others are facing the same issue and resorting to alternatives to robocopy. The only change I made apart from atime is the DOS attribute auxiliary settings. If you have any insight, please share.

I genuinely thought I figured this out, perhaps there are some settings that make my backups go faster still.

Kind regards,
M
 

hescominsoon

Patron
Joined
Jul 27, 2016
Messages
449
Make sure you have writes set to asynchronous. The initial 800 MB/s transfer is RAM caching; after that it's hardware to hardware.
 

marum

Dabbler
Joined
Aug 22, 2019
Messages
12
@hescominsoon, I actually have 800MB/s hardware transfer speed. (There is no ZFS ZIL)
What I meant is that copying the initial data to an empty folder reaches this speed, even with robocopy, even for some 100TB folders. Once the data is there and I run robocopy again to sync the changes, transfer speeds drop significantly. This is due to file comparisons. Without the adjustments mentioned above it drops to 60 MB/s at best. After the adjustments it hits about 300 MB/s, depending on the nature of the files. It runs in multi-threaded mode, so even if it is performing some large file transfers from multiple sources, there is still overhead from looking for other files that do or do not need to be synced.
 