Slow transfer speeds with many small files

Status
Not open for further replies.
I've recently set up FreeNAS on an old Dell computer. When transferring large files, the average transfer rate is pretty good, around 50 MB/s. Then I tried to transfer a folder containing around 80K small files (around 1-100 KB each), and I'm now getting transfer rates around 200 KB/s. Does anyone have any info about why there is such a dramatic loss of write speed? Here are my specs:

Hardware:
2GB ram
2x2TB Seagate Barracuda drives 7200rpm
1x320GB drive (this is where FreeNAS is installed. I know it's best to use a USB stick, but I didn't have one kicking around).
Intel(R) Core(TM)2 CPU 6420 @ 2.13GHz
NX1101 Gigabit Network card

Software:
FreeNAS-8.3.0-RELEASE-p1-x64 (r12825)

The 2x2TB drives are mirrored and formatted with ZFS. I'm connecting to the NAS from my Mac using AFP.
 

ben

FreeNAS GUI Developer
How full is that volume, pray tell?
 
The 2TB mirror is split into two datasets. One is 1TB, used for Time Machine backups, and is 60% full. The other is 800GB and is currently 2% full. The 800GB dataset is the one I was transferring the large number of small files to. At the time of transfer there was no other data being sent to the drives (via Time Machine or other computers), since I'm the only user on my network.
 

cyberjock

Inactive Account
There are two reasons why speed suffers; here's the breakdown. (I'll assume copying a file to the server in my examples, but going the other way has the same limitations.)

When copying a file, the sharing protocol (as far as I know, all of them work this way) requires extra communication for each file. Here's basically what the exchange looks like. Each step won't occur until the previous step is complete.

1. Workstation -> Server : "I want a new file called document.doc"
2. Server -> Workstation : "Your file is created... send me any contents you want" (The server verifies the appropriate permissions and creates the file)
3. Workstation -> Server : File contents sent. (The server also sends acknowledgement packets as the file transfers)
4. Workstation -> Server : "File contents completely sent. Please close the file"
5. Server -> Workstation : "File is closed and saved to the hard drive" (The server MUST complete all applicable writes from the write cache before sending this response. This is often called a sync write.)*
6. The workstation then deletes the file locally (if moving) or begins again at #1 with the next file (if only copying).

Overall, at each step you are waiting for the other machine to complete their step.
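The steps above can be sketched as a back-of-envelope model. All numbers here are illustrative assumptions (a 20 ms combined per-file cost for round trips plus seeks, and the OP's 50 MB/s raw rate), not measurements:

```python
# Rough model of transfer time: each file pays a fixed per-file
# overhead (blocking protocol round trips plus disk seeks) before
# its payload moves at the raw transfer rate.

def transfer_time_s(num_files, file_size_kb, overhead_ms=20.0, rate_mb_s=50.0):
    overhead = num_files * overhead_ms / 1000.0     # blocking per-file cost
    payload_mb = num_files * file_size_kb / 1024.0  # total data in MB
    return overhead + payload_mb / rate_mb_s

# Same ~7.6 GB moved as one big file vs. 80,000 small files:
one_big = transfer_time_s(1, 80_000 * 100)
many_small = transfer_time_s(80_000, 100)
print(f"one big file: {one_big:.0f} s, 80k small files: {many_small:.0f} s")
```

The payload term is identical in both cases; the per-file overhead term is what makes the small-file copy an order of magnitude slower.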

Now, also consider that as each read or write is performed, if you are using magnetic media you are also adding in seek time for the hard disk heads. Typically, with a seek time of 5 ms you can expect an absolute maximum of 200 writes per second, without considering any other latency. Now consider that you are also executing 3 independent writes for each file: the file system creating the file, the file contents being written, and the file system closing the file. So you are really talking about a maximum of, at best, about 65 files per second. Add in the latency of the protocol, plus the fact that the workstation has its own 5 ms of seek time, and you can easily find yourself with only 10-30 files per second. If each of those files is only a few KB, then you can expect a total transfer rate of only 10-30 times the file size.
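Spelling out that arithmetic (the 5 ms seek and the 20 KB average file size are assumed figures from the paragraph above, not measured values):

```python
# Ceiling from seek time alone, then the realistic throughput range.
seek_s = 0.005                       # assumed 5 ms average seek
max_writes_per_s = 1 / seek_s        # 200 writes/s ceiling
writes_per_file = 3                  # create + contents + close
max_files_per_s = max_writes_per_s / writes_per_file
print(f"{max_files_per_s:.0f} files/s ceiling")          # ~67 files/s

# With protocol latency and the workstation's own seeks, assume 10-30 files/s:
avg_file_kb = 20                     # assumed average small-file size
for files_per_s in (10, 30):
    print(f"{files_per_s * avg_file_kb} KB/s")           # 200 and 600 KB/s
```

That 200-600 KB/s range lines up with the ~200 KB/s the original poster is seeing.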

There's no easy way to beat this. Even when one side has an SSD (thereby removing the 5 ms penalty for seeks), the other side will still be quite busy trying to complete those seeks.

* - If the amount of data that needs to be written is less than 64 kilobytes and you have a ZIL for the zpool (preferably on an SSD), then the sync write will happen to the ZIL instead of your pool and you will save a small amount of time. However, the ZIL will still have to be processed at some point in the future (typically within 2-15 seconds), so your zpool will still end up busy in the very near future, which could slow down future writes too. There is a way to disable sync writes entirely without a ZIL, but that is extremely dangerous for your data and can result in corruption of the zpool and loss of data. I don't remember the exact tweak, but if I did I don't know if I'd share it, because it is almost irresponsible to use in most situations. For small file transfers it won't significantly help anyway.

Hopefully this all makes sense. :P
 
This is a really good explanation! Thanks so much for the detail. This helps me to understand a bit more of what's going on. I suppose the slow write times are not a huge deal since it's just backing up data. In theory if macs supported ZFS natively and you could mount a ZFS volume (eliminating the overhead of the sharing protocol), would this dramatically improve performance or would the improvement only be small?
 

cyberjock

Inactive Account
The improvement should be negligible. Your server normally isn't so loaded, nor network latency so high, that they affect performance much. Something like 90% of the wait time is hard drive seek time.
 

squeedle

Cadet
Really appreciate the explanation, cyberjock. I experienced the same slow transfer speeds on lots of small files. A potential workaround seems to be archiving old or rarely used small files into zip files. But then I guess you risk losing more data if one of those zip files gets corrupted...
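If anyone wants to try that zip workaround, here is a minimal sketch (the function name and paths are placeholders) that bundles a tree of small files into a single archive, so the share sees one large sequential write instead of thousands of per-file create/close cycles:

```python
import zipfile
from pathlib import Path

def archive_tree(src_dir: str, dest_zip: str) -> int:
    """Zip every file under src_dir, preserving relative paths.
    Returns the number of files archived."""
    src = Path(src_dir)
    count = 0
    with zipfile.ZipFile(dest_zip, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src so the archive unpacks cleanly.
                zf.write(path, arcname=path.relative_to(src))
                count += 1
    return count
```

The corruption concern above is real, though: a single damaged archive can take many files with it, so keeping a second copy of important archives is worth considering.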
 

cyberjock

Inactive Account
You could make a backup server that handles "archives" as well as backups of your primary system. Then you can kill 2 birds with 1 stone.
 