FreeNAS write speed troubleshooting

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
Hi there,

managed to get my system running:

box dl380e
cpu Xeon E5-2430L 2.0GHz (1x but there is room for 2)
ram 8GB RDIMM
nic 10Gbit/s Intel 2x (identical cards for NAS and client - worked out of the box without manually installing drivers)
interface type: sfp+ (OM3 fiber + FS Intel transceivers)

boot device: SeaGate HDD (temporary)
1 pool: striped/mirror, 8x 3TB SeaGate HDD

some (relevant?) images:








I'm wondering about the write speed.

I have to add that there are 2x SSD's on the way. These will become the boot devices (mirror)
And also 2x sticks of 16GB RDIMM

Is this enough information to troubleshoot the low'ish write speed?
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
thank you for the reply,

I'll see how it runs with the single CPU and 32GB... for a 10TiB pool I think I'm within the recommended spec. If, after the SSD/RAM upgrade, I still don't manage to max out the speeds, I'll consider installing a second CPU and another 32GB of RAM.

drives are: barracuda ST3000DM008 (all 8 of them identical). I know these aren't optimal drives, but I want to use them until they start to fail and gradually replace them with better and bigger drives.

EDIT: obviously I'm concerned about the read speed, I made a mistake in the title :)
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
BTW, have a look at "The path to success for block storage" if you haven't already.

11) Write speeds greater than read speeds?

When ZFS is "writing" to the pool, it is actually creating a transaction group in RAM to commit to disk later. If ZFS is "reading" from the pool and the data is not in ARC/L2ARC, it actually needs to go out to a HDD to pull the data in. If your read speeds are slower than your write speeds, it just means that the data being read wasn't in cache. If you expected that data to be in the working set, perhaps your working set is too small.
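To put rough numbers on that, here is a toy model of how the cache hit ratio drags the average read speed around. It is only a sketch: the hit ratios and both throughput figures are made-up assumptions, and it says nothing about how ZFS accounts for ARC internally.

```python
def effective_read_mbps(hit_ratio, cache_mbps, disk_mbps):
    """Average read throughput when a fraction of the bytes comes from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Time to move one MB is the weighted sum of the cache and disk per-MB times.
    seconds_per_mb = hit_ratio / cache_mbps + (1.0 - hit_ratio) / disk_mbps
    return 1.0 / seconds_per_mb


# Hypothetical figures: 1000 MB/s from RAM over the network, 200 MB/s from disk.
for hit in (0.0, 0.5, 0.9, 1.0):
    speed = effective_read_mbps(hit, cache_mbps=1000.0, disk_mbps=200.0)
    print(f"hit ratio {hit:.0%}: {speed:.0f} MB/s")
```

Even a 90% hit ratio leaves the average well short of the cached speed in this model, which is why a working set that does not fit in RAM shows up as slow reads.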

I'll see how it runs with the single CPU and 32GB
Correct me if I am wrong, but your CPU is capable of HT right? If so, I would suggest that you ensure you have that enabled in the BIOS.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
What CPU usage do you get while reading/writing?
Not sure you can see whether a single core is used 100% or not in the graph. You could try disabling cores to make it a quad, dual and single core CPU, test the speed each time, and see where and how you hit that ceiling.
E5-2430L is not a fast cpu for single thread applications.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
HT is enabled in bios, yup.

weird, after a reboot this happened:



that seems to make more sense, right?


EDIT:
What CPU usage do you get while reading/writing?

This is while the diskspeedtest is running:




at this point I can play back 100MB/s video in resolve (DNxHDX)

but DPX and EXR (file based, image sequence video) at around 200MB/s at first has some FPS drops.. but after Resolve played the clips once or twice, the fps is stable.

UHD DPX/EXR at 800MB/s is a no go though. But that makes sense.


E5-2430L is not a fast cpu for single thread applications

I could upgrade the CPU. Which one out of this list would you recommend?

 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
Update:

so after some work, the read speed dropped again:



 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
It depends on what is limiting performance.
Going for one with a higher clock should shove data through it a bit faster, so a v2 version of the same model or a non-L part; the L version does not seem to cost much performance for v2 anyway. If there is any benefit to more cores I would probably grab a higher core count like the 2450 or 2470 rather than having two CPUs and risking fetching memory through the link between them, which I think those have. It depends on what you can get at a cheap price, as none of them is worth that much in today's servers.
Check out passmark to get an idea what cpu model does to single thread score.

Check if you can set core count in bios just to test with.

Might be worth disabling HT and testing that as well, since you are probably not thread limited by 6 cores atm.

Also run a scrub, check CPU and disk speed to see how much data it can read, and compare that to the speed over the network.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
It's very weird, I just gave my Windows 10 box a reboot and the speeds are back up again:



It depends on what is limiting performance.
Going for one with a higher clock should shove data through it a bit faster, so a v2 version of the same model or a non-L part; the L version does not seem to cost much performance for v2 anyway. If there is any benefit to more cores I would probably grab a higher core count like the 2450 or 2470 rather than having two CPUs and risking fetching memory through the link between them, which I think those have. It depends on what you can get at a cheap price, as none of them is worth that much in today's servers.
Check out passmark to get an idea what cpu model does to single thread score.
Ok, good idea, I'll check that out.

Check if you can set core count in bios just to test with.

Might be worth disabling HT and testing that as well, since you are probably not thread limited by 6 cores atm.
Probably makes sense yeah! Will try!

Also run a scrub, check CPU and disk speed to see how much data it can read, and compare that to the speed over the network.
Am doing that now, but I'm not sure where I can monitor the speed?
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
You should be able to see what each disk is reading at in reporting. So if disk one reads at 100MB/s during a scrub but 50MB/s over the network, you have found some sort of loss there, while if it's the same then it could be the CPU or the disks simply not giving more performance. Disks should also pretty much show the same, so if there is one that is slower or has more pending IO or latency etc. then that is a problem.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
You should be able to see what each disk is reading at in reporting. So if disk one reads at 100MB/s during a scrub but 50MB/s over the network, you have found some sort of loss there, while if it's the same then it could be the CPU or the disks simply not giving more performance. Disks should also pretty much show the same, so if there is one that is slower or has more pending IO or latency etc. then that is a problem.

okay so... the fastest disk is at 12000 kB/s read. So times 4 that is 48MB/s. Is my math horrible or is there a problem?

Also, half of the disks are at around 12000, while the other half is around 8000 and some even lower. These disks are all identical and part of the same striped mirror.
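One way to sanity-check the arithmetic is to add up every disk rather than multiply the fastest one. This is a sketch under assumptions: the rates are rounded from the figures above (four disks near 12000 kB/s, four near 8000 kB/s), and it assumes the reporting graphs use 1 MB = 1000 kB. A scrub reads both sides of each mirror, so summing all eight disks is arguably the fairer total.

```python
def pool_read_mb_per_s(per_disk_kb_per_s):
    """Sum per-disk read rates given in kB/s into one figure in MB/s."""
    return sum(per_disk_kb_per_s) / 1000.0


# Rounded readings from the reporting charts (assumed values, see above).
rates_kb_per_s = [12000] * 4 + [8000] * 4
total = pool_read_mb_per_s(rates_kb_per_s)
print(f"{total:.0f} MB/s across {len(rates_kb_per_s)} disks")
```

Either way the total stays far below what eight drives should manage on sequential reads, so the per-disk numbers themselves are the odd part, not the multiplication.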

Here's an image to show you the charts I'm reading


This is during a manually assigned scrub:



I was wondering, did I create the pool properly?

And I like the idea of the symmetry behind a striped mirror.. but perhaps it is overkill in my situation? It would be cool to achieve a little over 800MB/s read speeds as that would allow me to play back uncompressed/lossless UHDp25 footage.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
Do you have any data on it atm? Could you boot Hiren's boot cd on it (can do that from USB) and run HDTune and see if all drives make an equal graph and show good speed? https://www.hirensbootcd.org/
Should be read only but just in case don't have any important data on it.

What HBA do you run?

Since you are not writing all the time I would think a Z2 of 8 drives would be faster for reads, I'm sure there are many benchmarks and tests out there tho. If you are still in the test phase, try both.

Depending on where it is in the scrub it might be hitting lots of random reads, but 12MB/sec per disk looks weird.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
ssd's and ram arrived and installed

installed freenas 11.3 RELEASE and this seems to give me more comprehensible disk transfer speed reports. When scrubbing (ssd + 32GB ram + 6 cores/HT) I get 780MB/s

CPU load at 18%, RAM load at 0.4% (services) and 28.2 (cache)

HBA: HP LSI 2380 (which has been flashed by the seller)

Am still in test phase yes, so I'd like to test!

Internally scrubbing, the disks seem to work fine, the speeds are equivalent, no large discrepancies.

But I have a problem with the pool.. over network, the speeds are down and when I exported/re-imported the pool into the new freenas OS, it places (ACL) behind some of the directories in the pool. Consequently, I cannot access these pools from the client system. Other folders are visible, I can read and write but I can't write copies of these hidden ACL folders into the home folder, it gives a permission error.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
couldn't be bothered troubleshooting so I tried a Z2:



EDIT:

I tried an 8-drive stripe and a different measuring tool:



The read speed hits the network limit, so I suppose we're good on that front.

Theoretical max is 1360MB/s read with 8 drives striped.
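As a rough check of those two ceilings: 1360MB/s over 8 drives implies 170MB/s per drive, and a 10Gbit/s link tops out at 1250MB/s before any protocol overhead. The sketch below just does that arithmetic; the function names are mine, and real SMB throughput will land somewhere under the raw link figure.

```python
def striped_max_mb_per_s(per_drive_mb_per_s, drives):
    """Ideal sequential throughput of a plain stripe: per-drive rate times drive count."""
    return per_drive_mb_per_s * drives


def link_mb_per_s(gbit_per_s):
    """Raw line rate in MB/s (1 Gbit = 1000 Mbit, 8 bits per byte), ignoring overhead."""
    return gbit_per_s * 1000.0 / 8.0


per_drive = 1360 / 8                      # MB/s per drive implied by the 1360MB/s figure
stripe = striped_max_mb_per_s(per_drive, 8)
link = link_mb_per_s(10)
bottleneck = "network" if link < stripe else "disks"
print(f"stripe {stripe:.0f} MB/s, link {link:.0f} MB/s, limited by {bottleneck}")
```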

The Davinci measuring tool seems to be designed for video data, specifically. But I'm wondering what kind of file format it uses. For example: an image sequence might require a very small allocation size unit on the HDD's. This would drastically speed things up. I've tested this with single drives a while ago and it works.

But I think all seems to be in order, no? Considering the measurements with crystaldiskmark?
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
Crystaldisk doesn't actually read that much data, so you are probably just reading from RAM at that point. If you look at the network graph it's just a short segment regardless of the size you choose (at least in the versions I have run). Do a file copy of 30+GB and see what that looks like.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
I can select 32 and 64 GiB file sizes in Crystaldisk. Speeds remain the same! When I play a 800MB/s uhd dpx file sequence and watch the network speed monitor, I get 800 - 1000 MB/s speeds. File plays back smoothly after a couple of seconds. I'm not sure how much that has to do with cache

I can't do manual file copy tests because my client SSD's speed is a bottleneck.
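One way around the client SSD, as a sketch: read a big file from the share and throw the bytes away, so nothing is written locally. The script below times exactly that. The demo at the bottom only reads a small temporary file so that it runs anywhere; for a real test, point `timed_read` at a 30+GB file on the NAS share (any path you pass is your own, none is assumed here), and keep in mind that a repeat run may be served from cache on either end.

```python
import os
import tempfile
import time


def timed_read(path, chunk_bytes=8 * 1024 * 1024):
    """Read a file start to finish, discard the data, return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:          # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def mb_per_s(total_bytes, seconds):
    """Average throughput in MB/s (1 MB = 1,000,000 bytes)."""
    return total_bytes / 1e6 / max(seconds, 1e-9)


# Demo on a 4 MiB temporary file; far too small to say anything about the NAS.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
demo_path = tmp.name
try:
    demo_bytes, demo_seconds = timed_read(demo_path)
    print(f"read {demo_bytes} bytes at {mb_per_s(demo_bytes, demo_seconds):.0f} MB/s")
finally:
    os.remove(demo_path)
```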
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
I mean it only tests for like 5 sec regardless of size.
But if you get the correct speed now then I suppose it's all good.
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
Yeah I do believe I get the correct speed.. I mean, I'm playing back image files of 33MB, 25 of them per second = 825MB/s without issue. It only takes a few seconds for some caching (I assume) but after that it runs smoothly. The FreeNAS network adapter monitor shows stable speeds of around 6.70Gb/s, roughly 838MB/s, while the footage is playing.
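The numbers line up when worked through, as this sketch shows. The 33MB and 25fps figures are from the playback above. The frame-size function is an assumption on my part: it models 10-bit RGB DPX packed into one 32-bit word per pixel and ignores the file header, which happens to give about 33MB for a 3840x2160 frame.

```python
def stream_mb_per_s(frame_mb, fps):
    """Sustained throughput needed to play an image sequence in real time."""
    return frame_mb * fps


def gbit_to_mb_per_s(gbit_per_s):
    """Convert a link reading in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000.0 / 8.0


def dpx_10bit_rgb_frame_mb(width, height):
    """Assumed layout: 3 x 10-bit samples packed into 4 bytes per pixel, header ignored."""
    return width * height * 4 / 1e6


needed = stream_mb_per_s(33, 25)
observed = gbit_to_mb_per_s(6.70)
frame = dpx_10bit_rgb_frame_mb(3840, 2160)
print(f"needed {needed} MB/s, observed {observed:.1f} MB/s, UHD frame {frame:.1f} MB")
```

The observed link rate sits a little above what the sequence needs, which fits playback being smooth once it gets going.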

Think I'll be settling on the Z2 configuration. I'll also be adding two hot spare drives to increase safety.

Managed to relocate the NAS to the basement. The HDDs are all running right below 20°C, the SSD mirror at 33°C.

Consider me a happy man!

Thanks for the help and the knowledge! And thank you to the creators of FreeNAS as well!

Next step is a UPS, I suppose!
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
IMO you only want a hot spare if you need the uptime or cannot get to the server within a reasonable time to swap a damaged drive. It's a good idea to have a drive lying around tho.
I'm sure that someone will chime in but a 9 drive z3 might be better than a 8 drive z2 with hot spare.
But regardless, safety comes from backups not redundancy. You could theoretically use those extra drives and store backup of files and then use them in the server in case one breaks. Just remember to read backup drives a few times a year so they have chance to repair any weakened data.
As I like to say: redundancy is for uptime, backup is for safety.
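To make the Z2-plus-spare versus Z3 comparison concrete, here is a small sketch. It only counts drives: it assumes the 3TB disks from this thread, treats capacity as data drives times drive size, and ignores ZFS metadata, padding and free-space headroom, so the TB figures are upper bounds.

```python
def raidz_layout(vdev_drives, parity, hot_spares=0, drive_tb=3):
    """Rough drive accounting for a single RAIDZ vdev plus optional hot spares."""
    data_drives = vdev_drives - parity
    return {
        "bays_used": vdev_drives + hot_spares,
        "data_drives": data_drives,
        "raw_data_tb": data_drives * drive_tb,
        # Simultaneous failures the vdev survives before a spare has resilvered.
        "failures_tolerated": parity,
    }


z2_with_spare = raidz_layout(vdev_drives=8, parity=2, hot_spares=1)
z3 = raidz_layout(vdev_drives=9, parity=3)
for name, layout in (("8-drive Z2 + 1 spare", z2_with_spare), ("9-drive Z3", z3)):
    print(name, layout)
```

Both layouts use nine bays and give the same six data drives; the Z3 has its third drive of protection active all the time, while the spare only helps once a resilver onto it has finished.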
 

woods

Dabbler
Joined
Jul 27, 2018
Messages
45
The hot spare would be a fool proof method. I'm a total noob so I'd rather play it safe. But what happens exactly when a drive fails? Does the system shut down automatically, since you're talking about "uptime"?

I know, redundancy is not a backup. The intent of this machine is to have a fast video server. That's all it is supposed to be. It should receive video rushes and automatically upload these to google cloud. I preferred to build a dedicated standalone NAS rather than having an internal RAID in the client system because I switch OSs on the client system and I also think, overall it's a more suitable approach of handling so much data in general.

After a project is finished, I offer the data to the production company/client and then simply remove it from my drives. Only the final product is rendered in high quality and backed up on google cloud (cold storage) as well as a double copy on DVD. One set of DVDs is stored at my place. The other copy is stored at my folks'. So that's 3 copies existing in at least 3 different places.

During production it is impractical to back up anywhere besides the google cloud. It would mean I'd have to get a second server running at a different location. I'm not a post production facility, so this is kind of out of my league. In any case, if my system fails and the data is gone, I can rely on the google backup. On high end shoots, there is already a physical copy of the data created on set. For little one-man gigs, I can afford to have "just" one backup in the cloud. On high end productions, I'm not responsible for the data though. But I always make my own copies, just in case someone else fucks up.
 