Very poor throughput copying a large number of small files.

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
I'm pretty new to the whole FreeNAS thing, so maybe this is an expected behavior and I just need to learn to live with it, but I'm having an issue copying large numbers of small files. In the current situation, I am copying a directory that includes a few levels of sub-directories that, in total, contain 20,000+ small files of 10KB to 50KB each. It's my Adobe Lightroom catalog... I'm copying FROM my FreeNAS to my PC, and what I'm seeing is that it will copy a bunch of files in a burst, then pretty much stop for a second, then resume. So far it has taken 30+ minutes to copy about 2.5GB of data.
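For scale, here is a rough calculation from the figures above. It treats "30+ minutes", "about 2.5GB" and "20,000+" as approximate values, so the results are only ballpark numbers, not measurements.

```python
# Figures quoted in the post; all three are approximate.
total_bytes = 2.5e9        # about 2.5 GB copied so far
elapsed_s = 30 * 60        # roughly 30 minutes
file_count = 20_000        # roughly 20,000 files

throughput_mb_s = total_bytes / elapsed_s / 1e6   # MB/s, with 1 MB = 1,000,000 bytes
files_per_s = file_count / elapsed_s
ms_per_file = 1000 * elapsed_s / file_count

print(f"{throughput_mb_s:.2f} MB/s, {files_per_s:.1f} files/s, {ms_per_file:.0f} ms per file")
# prints "1.39 MB/s, 11.1 files/s, 90 ms per file"
```

A 1GbE link tops out at 125 MB/s in theory, so at these numbers the copy is using only about 1% of the wire. That suggests per-file overhead, not raw bandwidth, is what limits it.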

Any advice will be appreciated - this catalog is just going to grow, and I need to copy it both ways on a regular basis.

The target on the PC is an M.2 NVMe drive, which is blazingly fast. My PC is a new build from the last month - AMD Ryzen 5 2600X, ASUS Crosshair VII Hero, 32GB of 3200MHz memory, 1GbE, 2xM.2 and 2xSSD drives. The PC shows less than 33% CPU and memory utilization, and low disk and network use.

My FreeNAS is a Supermicro server with the following characteristics:
  • Supermicro SuperChassis 846E1-R1200B Dual 6 Core Xeon 24 x HDD Storage Server
  • Dual Intel Xeon E5-2620 15M 2GHz
  • 128GB Ram (16x 8GB PC3-12800R)
  • SAS Controller: LSI 9207-8i
  • Supermicro Motherboard X9DRi-F
  • Backplane: BPN-SAS2-846EL1
  • Power Supplies: Dual PWS-1K21P-1R 1200W 80 Gold Plus
  • 16 x 3TB drives (14 used Seagate Constellation ES.3 SAS, 2 Toshiba SATA) in 2 RAIDZ2 vdevs
  • 8 x 4TB drives (5 Seagate Constellation ES.3 SAS, 3 Seagate Ironwolf SATA) in 1 RAIDZ2 vdev
All the reports show low utilization:
[Attached screenshots: freenas 3.jpg, freenas 4.jpg, freenas 5.jpg, freenas 6.jpg]
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Do you need to copy the entire catalog each time?
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Oops - posted too quickly. Nothing with your setup jumps out to me as being a problem. If I were you, I would use Beyond Compare to sync back and forth. Your initial sync will take the longest, but subsequent syncs will not take nearly as long.
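To illustrate why later syncs are quicker than the first one, here is a minimal one-way sync sketch. It is not how Beyond Compare works internally; `sync_tree` is a hypothetical helper that only copies a file when it is missing at the destination or when its size or modification time suggests it has changed.

```python
import os
import shutil

def sync_tree(src, dst):
    """One-way sync: copy files from src to dst only if missing or changed.

    Returns the number of files actually copied.
    """
    copied = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            s_stat = os.stat(s)
            if os.path.exists(d):
                d_stat = os.stat(d)
                # Treat the file as unchanged when the size matches and the
                # destination copy is at least as new (whole-second precision).
                if (s_stat.st_size == d_stat.st_size
                        and int(s_stat.st_mtime) <= int(d_stat.st_mtime)):
                    continue
            shutil.copy2(s, d)  # copy2 also preserves the modification time
            copied += 1
    return copied
```

The first call copies everything. A second call on an unchanged tree still has to stat every file, but it transfers no data, which is where the time saving comes from. The sketch never deletes anything at the destination.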
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I have a system I support for work that uses a massive number of small files for a kind of database. There is nothing worse for performance than a mass quantity of small files.
I'm pretty new to the whole FreeNAS thing, so maybe this is an expected behavior
It isn't exactly unique to FreeNAS. I have seen the same type of thing with Windows and Linux. Each file requires a seek operation so it incurs some latency from the disk, mechanically.
I'm copying FROM my FreeNAS to my PC, and what I'm seeing is that it will copy a bunch of files in a burst, then pretty much stop for a second, then resume. So far it has taken 30+ minutes to copy about 2.5GB of data.
You probably want to do an iostat of the pool to see if any of the disks is slowing the pool down. You could have a slow disk that is killing pool performance. The starting and stopping is usually a sign of the cache flushing and filling, but what you are describing sounds a bit extreme.
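The per-file latency point can be put into a simple model: total time is roughly a fixed cost per file plus the time to move the bytes. The sketch below is illustrative only. The 10 ms per-file overhead and the 110 MB/s bandwidth are assumed values, not measurements from this system, and `estimated_copy_seconds` is a hypothetical helper.

```python
def estimated_copy_seconds(file_count, total_bytes, per_file_overhead_s, bandwidth_bytes_s):
    """Crude model: a fixed cost per file plus streaming time for the data."""
    return file_count * per_file_overhead_s + total_bytes / bandwidth_bytes_s

# Assumed inputs: 10 ms of overhead per file, about 110 MB/s of usable bandwidth.
many_small = estimated_copy_seconds(20_000, 2.5e9, 0.010, 110e6)
one_large = estimated_copy_seconds(1, 2.5e9, 0.010, 110e6)

print(f"20,000 files: {many_small:.0f} s, one file of the same size: {one_large:.0f} s")
```

With these assumptions the same amount of data takes about ten times longer as 20,000 files than as one file, purely because of the per-file term.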
 

John Doe

Guru
Joined
Aug 16, 2011
Messages
635
The hardware of your FN machine looks quite impressive and should easily handle this.

I am not sure whether mixing SATA and SAS will cause problems, but it is something I would investigate.
Have you tried copying a big file (a large zip archive or an .iso)? If the speed is also slow, there might be a problem with the network, the share, etc.

Another point: since you have such a serious setup, have you considered using a ZIL/SLOG in combination with synchronous writes? Apparently there is an advantage for small files.

Would another option be to use SSDs for those small files?
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
Thanks, all.

In actuality I can probably set up some kind of synchronization/backup process to avoid copying the entire directory tree. Today's case was the initial copy of the catalog from my old PC (whose data is now out on the FreeNAS) to my new PC.

I'll look into Beyond Compare - in the past I've used DirSyncPro, but I never really trusted it to run without manual oversight.

In the past I only did occasional backups of my Lightroom data - one of my reasons to build the NAS was so I can set up a lot better protection for this data. I still have somewhere between 100K and 200K photos that are not yet in Lightroom (many still have to be scanned from negatives or slides), but my goal is to include them all, so my current 20K photo database will grow quickly. I know it sounds like a huge number of photos, but in the digital age, and with a camera that shoots 5 to 6 frames per second, at one sporting event I can easily capture 500 or 1,000 images.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
I ran CrystalDiskMark on my system the other day. Here's the data for one of my SMB shares, which I think looks reasonable:

-----------------------------------------------------------------------
CrystalDiskMark 6.0.0 x64 (C) 2007-2017 hiyohiyo
Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 110.878 MB/s
Sequential Write (Q= 32,T= 1) : 111.887 MB/s
Random Read 4KiB (Q= 8,T= 8) : 96.937 MB/s [ 23666.3 IOPS]
Random Write 4KiB (Q= 8,T= 8) : 103.106 MB/s [ 25172.4 IOPS]
Random Read 4KiB (Q= 32,T= 1) : 71.756 MB/s [ 17518.6 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 97.320 MB/s [ 23759.8 IOPS]
Random Read 4KiB (Q= 1,T= 1) : 10.840 MB/s [ 2646.5 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 10.329 MB/s [ 2521.7 IOPS]

Test : 1024 MiB [W: 1.1% (205.0/18015.2 GiB)] (x5) [Interval=5 sec]
Date : 2019/04/01 14:28:48
OS : Windows 10 Professional [10.0 Build 18865] (x64)

Of course, the M.2 disk is FAST:
-----------------------------------------------------------------------
CrystalDiskMark 6.0.0 x64 (C) 2007-2017 hiyohiyo
Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 2857.089 MB/s
Sequential Write (Q= 32,T= 1) : 1500.961 MB/s
Random Read 4KiB (Q= 8,T= 8) : 534.028 MB/s [ 130377.9 IOPS]
Random Write 4KiB (Q= 8,T= 8) : 485.218 MB/s [ 118461.4 IOPS]
Random Read 4KiB (Q= 32,T= 1) : 449.812 MB/s [ 109817.4 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 395.856 MB/s [ 96644.5 IOPS]
Random Read 4KiB (Q= 1,T= 1) : 53.874 MB/s [ 13152.8 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 113.951 MB/s [ 27820.1 IOPS]

Test : 1024 MiB [H: 4.5% (21.6/476.8 GiB)] (x5) [Interval=5 sec]
Date : 2019/04/01 14:18:36
OS : Windows 10 Professional [10.0 Build 18865] (x64)
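The Q=1, T=1 rows are the ones closest to a file-by-file copy, since only one request is outstanding at a time. A quick sketch of how those IOPS figures translate into per-request latency, using the read numbers reported above (`q1t1_summary` is just an illustrative helper name):

```python
BLOCK_BYTES = 4 * 1024  # the 4KiB block size used by the random tests

def q1t1_summary(iops):
    """With one outstanding request, latency is simply the inverse of IOPS."""
    latency_ms = 1000.0 / iops
    mb_per_s = iops * BLOCK_BYTES / 1e6  # MB = 1,000,000 bytes, as in the report
    return latency_ms, mb_per_s

for label, iops in [("SMB share read", 2646.5), ("M.2 read", 13152.8)]:
    latency_ms, mb_per_s = q1t1_summary(iops)
    print(f"{label}: {latency_ms:.3f} ms per request, {mb_per_s:.3f} MB/s")
```

This gives roughly 0.378 ms per 4KiB request over SMB versus about 0.076 ms on the local drive, and it reproduces the MB/s values in the report (10.840 and 53.874).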
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
I'd look into a fast SLOG like the Intel Optane P4801X series. I have the 100GB module, it sits in an x4 slot and flies. If you want to go super safe, consider a dual, mirrored SLOG. Your board doesn't seem to have a set of NVMe connectors, so I would consider a bifurcating card that turns one of your PCIe 3.0 x8 slots into two x4 slots, one for each NVMe card.

Since you're using Lightroom, another thing to look into is an L2ARC drive (maybe a 1TB SSD) that can help with all the thumbnails. Not sure if thumbnails are considered metadata (the demi-gods roaming the halls here will have to weigh in on that one) but I found a significant benefit with the combination of rsync and setting the L2ARC to metadata only. Granted, the L2ARC has to be re-populated every time you restart the server, but that shouldn't happen too often.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
Here's iostat output. I am certainly no Unix guru, but it looks basically OK to me. There's no pattern of SAS/SATA performance impact I can see.

Code:
                             extended device statistics
     device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b
SAS  da0            1      39     33.1    214.6     6     0     9     1    0   4
SAS  da1            1      39     33.1    214.4     5     0     3     0    0   3
SAS  da2            1      39     33.1    214.2     5     0     2     0    0   3
SAS  da3            1      39     33.1    214.4     6     0    12     1    0   4
SAS  da4            1      39     33.1    214.1     5     0     3     0    0   3
SAS  da5            1      39     33.1    214.3     5     0     3     0    0   3
SAS  da6            1      39     33.2    215.6     6     0    10     1    0   4
SAS  da7            1      39     33.2    214.2     6     0    11     1    0   4
SATA da8            3      30     92.4    300.1     5     0    14     1    0   3
SATA da9            3      30     92.0    300.0     4     0    11     1    0   3
SAS  da10           3      24     96.7    300.8     5    12     1    10    0  27
SAS  da11           3      24     96.7    300.7     5    12     1    10    1  26
SAS  da12           3      25     96.7    300.9     5    12     1    11    1  27
SAS  da13           3      24     96.7    300.8     5    12     1    10    0  27
SAS  da14           3      24     96.6    300.8     5    12     1    10    1  26
SAS  da15           3      24     97.0    300.7     6    12     1    11    0  27
SATA da16           3      27     81.8    251.0     1     0     4     0    0   2
SATA da17           2      27     86.2    250.9     3     1    16     1    0   5
SATA da18           2      27     86.3    250.9     3     0    15     1    0   5
SAS  da19           3      22     84.6    251.3     2    13     5    11    0  25
SAS  da20           3      22     84.6    251.3     2    13     5    11    1  25
SAS  da21           2      22     86.5    251.4     3    13     4    11    0  25
SAS  da22           2      22     86.5    251.4     3    13     4    11    0  25
SAS  da23           2      22     86.5    251.3     3    13     4    11    0  24
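Eyeballing 24 rows is error-prone, so here is a small sketch that parses rows in the format above and flags disks whose %b is well above the median. The SAS/SATA label in the first column is the one added in the post, not something iostat prints. The 3x threshold is an arbitrary assumption, and `parse_rows` and `busy_outliers` are hypothetical helper names. Only a few rows are embedded as sample data.

```python
from statistics import median

# A handful of rows copied from the output above, including the added bus label.
SAMPLE = """\
SAS  da0            1      39     33.1    214.6     6     0     9     1    0   4
SATA da8            3      30     92.4    300.1     5     0    14     1    0   3
SAS  da10           3      24     96.7    300.8     5    12     1    10    0  27
SATA da16           3      27     81.8    251.0     1     0     4     0    0   2
SAS  da19           3      22     84.6    251.3     2    13     5    11    0  25
"""

COLUMNS = ["r/s", "w/s", "kr/s", "kw/s", "ms/r", "ms/w", "ms/o", "ms/t", "qlen", "%b"]

def parse_rows(text):
    """Parse 'BUS device v1 ... v10' lines; lines with another shape are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 + len(COLUMNS):
            continue
        bus, device, *values = parts
        row = {"bus": bus, "device": device}
        row.update(zip(COLUMNS, map(float, values)))
        rows.append(row)
    return rows

def busy_outliers(rows, factor=3.0):
    """Return devices whose %b exceeds `factor` times the median %b."""
    typical = median(row["%b"] for row in rows)
    return [row["device"] for row in rows if row["%b"] > factor * typical]

print(busy_outliers(parse_rows(SAMPLE)))
```

On the sample rows this flags da10 and da19, whose %b values (27 and 25) sit far above the 2 to 4 seen on the other sampled disks. That says nothing about the cause. Also, if this was a single iostat run with no interval, the figures are typically averages since boot and not a view of the copy in progress.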
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
Try the L2ARC first, 1TB SSDs are cheap.

Love your rig. It's incredibly inexpensive for what it is.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
I'd look into a fast SLOG like the Intel Optane P4801X series. I have the 100GB module, it sits in an x4 slot and flies. If you want to go super safe, consider a dual, mirrored SLOG. Your board doesn't seem to have a set of NVMe connectors, so I would consider a bifurcating card that turns one of your PCIe 3.0 x8 slots into two x4 slots, one for each NVMe card.

Since you're using Lightroom, another thing to look into is an L2ARC drive (maybe a 1TB SSD) that can help with all the thumbnails. Not sure if thumbnails are considered metadata (the demi-gods roaming the halls here will have to weigh in on that one) but I found a significant benefit with the combination of rsync and setting the L2ARC to metadata only. Granted, the L2ARC has to be re-populated every time you restart the server, but that shouldn't happen too often.
Thanks. I'll look into that. My plan was to use my second local M.2 drive (512GB) for my Outlook data file(s) and my Lightroom data and just back them up to the NAS nightly instead of working directly from the FreeNAS. The actual photos, however, are on the server. Since both the FreeNAS and the PC are new to me (it was my old PC self-destructing that started me on this whole project) I have yet to prove that this is a workable solution.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
Try the L2ARC first, 1TB SSDs are cheap.

Love your rig. It's incredibly inexpensive for what it is.
Thanks. I had been looking at off-the-shelf boxes for years, but could never justify the cost. Getting a ~32TB NAS for an all-in cost of $1,600, and a 1500VA UPS for $200, seemed like a deal I couldn't pass up. I'm still learning this Unix stuff, which I had avoided like the plague for a long time (my background is DEC PDP-11, VMS, Windows-NT (VMS ripoff - V->W, M->N, S->T, developed at MS by the former VMS architect, Dave Cutler), WindowsXX). I haven't changed my opinion about Unix, but it's getting the job done, so far... Thanks to all for their guidance.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
Used server gear is such a good deal. The only downside is the electricity bill, which may become substantial. FreeNAS has a steep learning curve but the community is amazing. Setting the L2ARC to metadata only has to be done from the CLI, IIRC.

Still remember working on a PDP-11/750 and marveling how the source tapes for Berkeley UNIX could be worth $50k, if they were delivered intact to the Russian embassy. Playing frisbee with the disassembled removable 5MB HD platters (post-headcrash) required the use of gloves. Or how the school had to put in a dedicated transformer just for the mini-computer to support the 340MB hard drive spinning up (450VAC, 3ph, 50A).

Good times.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I know it sounds like a huge number of photos, but in the digital age, and with a camera that shoots 5 to 6 frames per second, at one sporting event I can easily capture 500 or 1,000 images.
Not so bad. I had a photography business for several years and it wasn't uncommon to come away from a full-day wedding / reception with 1000 or more images. Lightroom really helps with being able to go through them and make quick adjustments and small corrections before showing them to the client. I wish I had the computing capability I have now back then, it would have been so much faster.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
Back in the days before there were such things as routers I used a bunch of 11/750s as routers for DEC's internal manufacturing network. They were good little boxes. Of course, "little" is relative to the era - they were about the size of a washing machine, as were the hugely expensive, similarly-sized, "massive" 512MB "RP07" disk drives we used. That was a really fun time in my career...

I picked up a Tripp-Lite Smart UPS 1500 on eBay for a ridiculously cheap price ($124). It says I'm drawing an average of 400W for the server, a monitor, a 24 port switch, and a VoIP box. (So, 390+W for the server). Last month my cost of electricity was 22.5 cents/kWh (all-in, including taxes, etc.). So, it costs a bit less than $0.10/hr (under $2.40/day) to run the server, which seems worth the cost.
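The running-cost arithmetic, spelled out as a small sketch. The 390W draw and the 22.5 cents/kWh rate are the figures quoted above. The 30-day month is an added assumption.

```python
watts = 390             # approximate server draw quoted above
price_per_kwh = 0.225   # dollars per kWh, all-in

cost_per_hour = watts / 1000 * price_per_kwh
cost_per_day = cost_per_hour * 24
cost_per_month = cost_per_day * 30  # assumes a 30-day month

print(f"${cost_per_hour:.3f}/hr, ${cost_per_day:.2f}/day, ${cost_per_month:.2f}/month")
```

That works out to just under 9 cents an hour, a little over $2 a day, and roughly $63 a month at a constant 390W.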
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,555
There might be something you can do to improve the performance, but you shouldn’t need to copy the catalog back and forth? I have my photos on the NAS and the catalog on my desktop. Every other shutdown or so of Lightroom it wants to back up the database. You can set it to back up more often if you wish. But the important part is you can set the NAS as your backup target, making sure you always have two copies of your database (and more if you also back up the NAS as you should).
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
Last month my cost of electricity was 22.5 cents/kWh (all-in, including taxes, etc.). So, it costs a bit less than $0.10/hr (under $2.40/day) to run the server, which seems worth the cost.
My cost for electricity is similar, but I have yet to do a power draw measurement. I was aiming for something below 100W. I'll do that as soon as I fit the new air shroud and adjust the fan speeds. Currently, they're running at full speed to deal with the SAS heat sink in particular. Those LSI 2116 sure like to run hot.
 

Andrew Ostrom

Explorer
Joined
Jul 28, 2017
Messages
57
There might be something you can do to improve the performance, but you shouldn’t need to copy the catalog back and forth? I have my photos on the NAS and the catalog on my desktop. Every other shutdown or so of Lightroom it wants to back up the database. You can set it to back up more often if you wish. But the important part is you can set the NAS as your backup target, making sure you always have two copies of your database (and more if you also back up the NAS as you should).
That's correct, and exactly how I set it up. It had been a while since I used Lightroom (I had been trying other software and gave up) - I have the catalog on my M.2 drive and it backs up to the FreeNAS. I had forgotten, also, that you don't need to back up the thumbnails since they will get recreated if you lose them. It was just that I had mounted my old PC drive on the FreeNAS, and then imported the disk. So, to get the catalog back on the local drive I just copied the whole directory tree, which included about 20,000 thumbnails. I'll never have to do that again, so I'm good.

I don't know about 100W, that seems like a very aggressive target. Since I've replaced about 100 halogen lights (mostly 90W PAR28LN, 100W A19 and 37W MR16s) with LED bulbs I figure it's just karma that I'm burning more for computing.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
Clearly, I need to measure it but I'm cautiously optimistic. The CPU has a TDP of 35W but idle power draw for the whole motherboard is allegedly around 31W. Active power for each drive is 7W, 5W idle. So it's still in the realm of possibility that the whole system idles at less than 100W. But there is no way to know until I hook up a power meter and start measuring.

But first, the air shroud has to go into place. That in turn will allow the SAS and CPU to get a lot more air flow from the Noctua Industrial fans I'm using, which in turn will allow me to run them at a lower than "blast" speed.
 
Joined
Jan 18, 2017
Messages
524
@Andrew Ostrom was your copy problem resolved? I had a similar problem when moving tens of thousands of images around and it was pegging the SMB process on FreeNAS.
 