Write Speed Issues

Doug183

Dabbler
Joined
Sep 18, 2012
Messages
47
I have just recently switched from NAS4Free to FreeNAS and have noticed that my pool's write speeds (a 4-drive RAIDZ in this specific instance) are dramatically different between the two operating systems: ~430 MB/s writes in NAS4Free vs. ~270 MB/s in FreeNAS. Hardware is identical in both setups; I simply created a new FreeNAS SSD boot drive and booted from it. I also have a second system that I switched to FreeNAS and am seeing the same issue there as well (not as thoroughly tested, but definitely slower pool write speeds).

NOTE: I want to stay with FreeNAS and realize this post may get inflamed. I request that responses push for a technical answer here so I can keep using FreeNAS. I do need the fast write speeds on these pools because I use these drives to capture video and images from a high-speed film scanner. When the pools slow down, the scanner automatically slows as well; this is actually how I discovered the issue.

Hardware:
Xeon E3-1230 v2 3.3 GHz
Supermicro X9SCM
32 GB ECC RAM
2x LSI 9201-16i - FW 20.00.01.00 - IT mode >>> 30 drives direct-attached to these cards
1x LSI 9207-8e - FW 20.00.01.00 - IT mode >>>
1 external SAS cable to >>> Intel RESCV360JBD expander >>> 20 drives & 10 open slots
1 external SAS cable to >>> external 4-drive bay box

Software
NAS4Free: 11.0.0.4 Sayyadina (4283)
FreeNAS: 11.2-U4.1

RAID in question:
- 6 TB usable RAIDZ (4x2TB)
- No compression in either OS.
- No Encryption
- No separate SLOG or L2ARC (on any pool)


TESTS and NOTES:
- To isolate the drives and remove the network from the picture, I SSHed into the system and ran a dd test (64 GB file, since I am dealing with large video files): dd if=/dev/zero of=tempfile bs=1M count=64000. The screenshots are from this test, and the results are consistently reproducible. (A sketch of the exact commands is below.)
- I tried various "sync" options on the pools in the FreeNAS GUI, but it didn't seem to affect the test results.
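
For reference, this is roughly what I ran over SSH; the pool/dataset path is a placeholder for my actual 4x2TB RAIDZ capture pool:

    cd /mnt/tank/capture                            # placeholder path for the pool under test
    zfs get compression,sync tank/capture           # confirm compression=off and the current sync setting
    dd if=/dev/zero of=tempfile bs=1M count=64000   # 64 GB sequential write
    rm tempfile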

[Attached screenshots: NAS4Free.png and FreeNAS.png — dd results under each OS]
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Couple of thoughts:
  • I wonder how much faster your transfer speeds would be if you chose to use a larger, multi-vdev pool instead of a single 4-drive vdev pool
  • How loaded is the CPU trying to process each test? (i.e. are all the checksums holding you back)
  • Have you tried turning on async writes, just to see what the raw hardware speed is without all the protections that FreeNAS provides when sync writes are turned on? (See the sketch after this list.)
  • I wonder to what extent a fast SLOG may help you with write speeds. However, intuition suggests a bigger pool with 2+ vdevs would likely do a lot more to help than a SLOG in this use case (assuming the CPU/RAM is sufficient)
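
A minimal sketch of that sync experiment, assuming a dataset named tank/capture (substitute your own); be sure to set it back when you're done:

    zfs get sync tank/capture              # current setting: standard, always, or disabled
    zfs set sync=disabled tank/capture     # async-only writes, for testing only
    dd if=/dev/zero of=/mnt/tank/capture/tempfile bs=1M count=64000
    zfs set sync=standard tank/capture     # restore the default afterwards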
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I request that responses push for a technical answer here so I can keep using FreeNAS. I do need the fast write speeds on these pools because I use these drives to capture video and images from a high-speed film scanner. When the pools slow down, the scanner automatically slows as well; this is actually how I discovered the issue.
I can't give you an answer as to why there is a difference between NAS4Free and FreeNAS, except maybe to do the test using "random" instead of "zero" for the "dd" tests. I know you said that compression is turned off, but I don't know how NAS4Free works and whether there is any compression actually running that you are unaware of. Using "random" should negate most of any compression if it exists. All this will do is put your dd tests on a level playing field.
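
As a rough sketch of what I mean, same file size but an incompressible source (on FreeBSD, /dev/urandom and /dev/random behave the same once seeded):

    dd if=/dev/urandom of=tempfile bs=1M count=64000   # incompressible data defeats any hidden compression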
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Other ways to improve your write speed: more RAM, meaning lots more. A SLOG may help too. But the best way is to rebuild the pool with multiple vdevs as mirrors, not RAIDZ.
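
Just to illustrate the layout, something like this (disk names and pool name are only examples, and FreeNAS would normally build this through the GUI; creating it destroys whatever is on those disks):

    zpool create capture \
      mirror da0 da1 \
      mirror da2 da3       # two mirrored vdevs, writes stripe across both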
 

Doug183

Dabbler
Joined
Sep 18, 2012
Messages
47
Other ways to improve your write speed: more RAM, meaning lots more. A SLOG may help too. But the best way is to rebuild the pool with multiple vdevs as mirrors, not RAIDZ.
Thanks. I will run the random tests, test my large 10-disk RAIDZ2 pool speeds in both OSs, and test some other ideas I have, like:
1) Is there some 4K/512 sector offset I don't understand? (See the ashift check sketched below.)
2) Is the OS that created the pool making the difference?
3) Blank drives vs. fragmented drives.
4) Optimal drive counts (3 drives in the case of RAIDZ vs. my 4-drive pool).
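
For item 1, I believe something like this should show whether a pool was created for 512-byte sectors (ashift=9) or 4K sectors (ashift=12); the pool name is a placeholder, and on FreeNAS the cache file lives at /data/zfs/zpool.cache:

    zdb -C -U /data/zfs/zpool.cache tank | grep ashift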

There is something I am doing in my workflow that is triggering this issue. I know it because I switched the film scanner capture pool back to NAS4Free, and write speeds came right back up to 350 MB/s. I also didn't mention that during the capture sessions under FreeNAS I was capturing tens of thousands of large still images (65 MB each) in real time, and this brought the capture pool to a crawl. (This is different from capturing one large file, which is what I normally do.) Regardless, after switching back to NAS4Free I was up and running again with no slowdown problem. Now, it could be a network setting that I am missing on the 10GbE network, which is why I didn't mention it before. But something is quite different between the two OSs. I am sure I can make it go faster with more drives and vdevs, but I want to see if we can find the setting I am missing (or dare I say bug), or maybe it's a safety feature used in FreeNAS that isn't there in NAS4Free.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
It very well could be the NIC driver between the two OS's. First, I'd do a search on Google for that NIC model and FreeNAS to see if anyone else is using that model and whether any postings will help you out. If that doesn't work, and the new dd tests don't show any difference from previous testing, I'm thinking you should start a new thread titled "NIC (model) runs slow in FreeNAS" or similar, provide all the hardware details, and ask for help. Provide the entire 10GbE hardware path from NAS to Computer/Scanner as well.

Good Luck!
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Do check to make sure that the 10GbE is actually running at the advertised speeds. I had a bear of a time with iXsystems' private-labeled Chelsio card playing nice with a LAGG failover, so much so that I abandoned the failover altogether. On top of that, I've occasionally also run into config issues killing the connection altogether, requiring a switch reboot to fix. Might be Mikrotik, might be FreeNAS.
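
A quick way to test the raw link, with no disks involved, is iperf3 (which I believe ships with FreeNAS 11; the hostname below is just an example):

    iperf3 -s                        # on the FreeNAS box
    iperf3 -c freenas.local -t 30    # on the client; a healthy 10GbE link should show roughly 9+ Gbit/s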

Now that I have new hardware, I may try again.
 

Doug183

Dabbler
Joined
Sep 18, 2012
Messages
47
Do check to make sure that the 10GbE is actually running at the advertised speeds. I had a bear of a time with iXsystems' private-labeled Chelsio card playing nice with a LAGG failover, so much so that I abandoned the failover altogether. On top of that, I've occasionally also run into config issues killing the connection altogether, requiring a switch reboot to fix. Might be Mikrotik, might be FreeNAS.

Now that I have new hardware, I may try again.
Sure. But let's shelve the NIC thread for now, only because the dd tests show the same issue AND eliminate NIC involvement altogether. However, I will be happy if that turns out to be the actual issue.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Sure. But let's shelve the NIC thread for now, only because the dd tests show the same issue AND eliminate NIC involvement altogether. However, I will be happy if that turns out to be the actual issue.
Yup, I said it backwards in my response, kind-of. But you understood what I meant.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Using /dev/random is not going to help. That will just stress the CPU.

Is this the same pool being imported under each OS, or are you rebuilding it each time? I would be interested in testing a new pool under each OS if possible. Also, you are going to want as many vdevs as possible for your workflow. Mirrors would probably be best, but that's a different conversation.
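
You can see the random-generator ceiling for yourself with something like this; no pool is involved, so if this number is lower than your pool's write speed, the RNG (not the disks) becomes the bottleneck in a dd-from-random test:

    dd if=/dev/urandom of=/dev/null bs=1M count=4096   # raw RNG throughput only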
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Using /dev/random is not going to help. That will just stress the CPU.
It won't help? I'm looking at possible compression issues. The user could just transfer a large mpg file instead of creating a new one.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
It won't help? I'm looking at possible compression issues. The user could just transfer a large mpg file instead of creating a new one.
Nope, at that point your bottleneck will be the CPU, not the disk speeds. /dev/random is super slow; that's why /dev/urandom exists, and even that will be slower than your disks.

Compression probably is not coming into play here. The performance they are seeing is not that crazy-fast number you get when nothing is actually being written; their numbers are completely reasonable. It could be record size, it could be queue depth.

You had also mentioned that more RAM would help. That is incorrect as well. Writes get flushed to disk from memory pretty quickly and do not take up much space; most of the RAM in a ZFS system is read cache. I also don't think the intent log is having any effect here, but the user should set sync=disabled just to be sure.
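
A quick sketch of what I'd look at (dataset name is just an example):

    zfs get recordsize,sync tank/capture    # large sequential video often does better with a bigger recordsize
    zfs set sync=disabled tank/capture      # rules out the intent log; set back to standard after testing
    gstat                                   # watch per-disk busy % and queue length while the dd runs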
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
IIRC, issues with compression creating inconsistent write results are the reason to use random data? Random data ensures that the CPU, the drives, etc. all get taxed equally, regardless of the OS / program running on top of it all (FreeNAS, NAS4Free, etc.).

If the CPU is up to snuff, the next suspect would be the pool configuration. I'd add VDEVs to the pool (as Z2 or mirrored sets) and compare performance. But I'd keep an eye on the CPU throughout testing to ensure that it doesn't quietly become the bottleneck once the VDEVs in a pool multiply.

Your system has a lot of disks available, and I'd consider committing them to 1-2 pools filled with VDEVs that best meet your needs. For example, 6-drive Z2 VDEVs may provide a good balance of data security and speed (assuming the CPU / HBA / etc. can handle it). Then slice and dice the pool as needed in FreeNAS re: permissions, quotas, mount points, and so on.
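
Purely as a sketch of the layout, with placeholder disk names (FreeNAS would normally build this via the GUI), a 12-disk pool made of two 6-drive Z2 VDEVs would look something like:

    zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 da8 da9 da10 da11   # writes stripe across both VDEVs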

At least that's what I'd do for your use case as I understand it. Your actual use case may be different, however! My pool putters along with only one VDEV in a Z3 config. But that's by design, as the electrical costs here are pretty outrageous and this is primarily a backup server.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Warning - Backhanded and condescending comments will not be tolerated.
IIRC, issues with compression creating inconsistent write results are the reason to use random data? Random data ensures that the CPU, the drives, etc. all get taxed equally, regardless of the OS / program running on top of it all (FreeNAS, NAS4Free, etc.).
I don't believe /dev/random is the correct way to do a dd test. Results can be very inaccurate.
 
Last edited by a moderator:

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
In the spirit of continuous improvement... exactly where did I advocate for the use of /dev/random? I simply advocated for the use of random data to avoid issues with compression (that may or may not be turned on by default in either system). The test data could consist of a large photo/video collection, for example (if that is what the OP is generating with his scanner).

Granted, many folks here apparently associate the term "random data" with the use of /dev/random, but I don't. Tests using /dev/zero with compression turned off certainly can create a "level playing field" for quick comparisons. But what's wrong with using test data and following a workflow that is as representative of the actual use case as possible? Cheers.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
exactly where did I advocate for the use of /dev/random?
You didn't; it was pointed at me. I don't think I agree, at least based on testing I did several years ago, but it's okay to disagree. For testing in your case, though, it's not about how long it takes to create a random file, it's how long it takes to copy a large random file. You could use any large existing file as the "if=" to test with.
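
For example, with paths that are only placeholders; ideally read from a different pool or device than the one you are measuring, so the reads don't compete with the writes being timed:

    dd if=/mnt/media/scan_reel_01.mov of=/mnt/tank/capture/tempfile bs=1M   # timed copy of real, incompressible data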
 