
Notes on Performance, Benchmarks and Cache.

anmnz

FreeNAS Guru
Joined
Feb 17, 2018
Messages
217
Thanks
117
Understood, my thought here was to try and understand the performance of the underlying disk hardware.
Well to point to one specific thing, the choice of "sync=always" will completely invalidate such a test. Do you realise that, among other problems, it will cause all data to be written twice?
 

KrisBee

FreeNAS Guru
Joined
Mar 20, 2017
Messages
896
Thanks
279
@wrayste ZFS was not designed to be the fastest filesystem per se, but using sync=always on a dataset in your dd test will have given a low value. In any case, the fio program is a far better way to benchmark your pool, and you need to appreciate the difference in the way ZFS handles sync and async writes (see for example: https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/ and https://www.ixsystems.com/blog/zfs-zil-and-slog-demystified/).

With just 4 drives I'd suggest you have only two realistic choices of pool layout: raidz2 or a stripe of mirrors. The latter maximises IOPS, as explained here: https://www.ixsystems.com/blog/zfs-pool-performance-1/ & https://www.ixsystems.com/blog/zfs-pool-performance-2/ So Chris Moore's question re: proposed use such as iSCSI, SMB, or NFS for shares is very pertinent here.

Alongside the fio benchmarking program, the CLI tools gstat, zilstat and zpool iostat can be used to monitor disk/pool performance/activity.
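As a sketch only (the pool name, mount path and job parameters below are placeholders to adapt, not anyone's exact commands), a sequential write test with fio plus some live monitoring might look like this:

Code:
# sequential write test against a file on the pool (path is just an example)
fio --name=seqwrite --directory=/mnt/tank/fiotest --rw=write --bs=1M \
    --size=16G --numjobs=1 --ioengine=posixaio --end_fsync=1

# in other sessions, watch per-disk and per-vdev activity while the test runs
gstat -p
zpool iostat -v tank 5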
 
Joined
Jun 4, 2019
Messages
5
Thanks
1
Well to point to one specific thing, the choice of "sync=always" will completely invalidate such a test. Do you realise that, among other problems, it will cause all data to be written twice?
I did not realise that; why does it cause a second write? I thought that setting ensured the disk write had completed before a sync request was acknowledged.

@KrisBee thanks, I will have a look at those this evening. I'm not expecting ZFS to be the fastest; the numbers I was getting were so far from what I was expecting that it caused me to query this.
 

anmnz

FreeNAS Guru
Joined
Feb 17, 2018
Messages
217
Thanks
117
I did not realise that; why does it cause a second write? I thought that setting ensured the disk write had completed before a sync request was acknowledged.
Sync writes ensure the data is written to disk before the write returns, yes, but ZFS does this by writing the data immediately to the "ZFS intent log" (ZIL). That data is then discarded from the ZIL once it's written again later through the normal async write process via a subsequent transaction group.

Look through this forum's resources for info on the ZIL for lots more information.

(The usual way to ameliorate the inevitable performance hit for sync writes is to move the ZIL to dedicated fast storage, where it is called a "separate intent log" (SLOG). The terms ZIL and SLOG are frequently confused.)
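As a rough sketch of what "moving the ZIL to dedicated fast storage" looks like, a SLOG is just a log vdev added to the pool. The pool and device names below are made up, and on FreeNAS you would normally do this through the GUI rather than the shell:

Code:
# add a single dedicated log device (SLOG) to pool "tank"
zpool add tank log nvd0

# or a mirrored SLOG, so losing one device doesn't lose in-flight sync writes
zpool add tank log mirror nvd0 nvd1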

Another wrinkle with using sync writes for low-level disk performance testing is that the *application's* sync write call does not return until the data is on disk. Notification of the successful write therefore has to travel all the way back up the stack, through every layer of hardware, firmware and software, to the user-space application before the thread that made the call can issue another write. It's really hard to predict a priori what the impact of that is, but it doesn't seem like a great start if what you are actually trying to do is see how fast the disks can go.
 

Chris Moore

Super Moderator
Moderator
Joined
May 2, 2015
Messages
9,355
Thanks
2,994
The storage will eventually be a mixture of video, music, pdfs, etc.
For this use, which sounds like a very standard installation, I would suggest an SMB share with sync set to standard. Most applications do not call for sync writes, and asynchronous is much faster because FreeNAS can use a RAM cache instead of needing to use the ZIL, as previously described by others. FreeNAS / ZFS uses RAM extensively for cache, which is why we suggest ECC memory to ensure reliable function of the system. It is common for the memory to remain constantly around 97% utilized. I have servers at work that have 256GB of RAM and use it all, all the time. ZFS can be fast, if you give it enough resources.
I think this is a good image to illustrate that:
[Attached image: Screenshot from 2019-06-04 12-22-20.png]
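To act on the sync=standard suggestion above: sync is just a per-dataset property, and something like the following checks and sets it (the dataset name is only an example):

Code:
# see what a dataset is currently doing with sync requests
zfs get sync tank/media

# honour sync requests from applications, but don't force every write to be sync
zfs set sync=standard tank/media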
One thing to understand about ZFS pools, very generally speaking: a pool is made of one or more vdevs (virtual devices), and each vdev behaves much like a single one of the disks that make it up. This varies with the type of vdev, and vdevs can be n-way mirrors or some flavour of RAIDZ (z1, z2, z3), which has an impact, but the simple answer is that more vdevs means more performance. If you have only one vdev, you are roughly limited to the performance of one drive. Also note that, with all vdevs being striped together (much like a RAID-0), redundancy (resilience) exists only at the vdev level: loss of any vdev due to disk failure would result in total loss of the pool.
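As a rough sketch (device names are placeholders), here are the two four-drive layouts mentioned earlier in the thread; the second gives the pool two vdevs to stripe across:

Code:
# one raidz2 vdev: capacity of ~2 disks, roughly the IOPS of a single disk
zpool create tank raidz2 da0 da1 da2 da3

# stripe of two mirrors: capacity of ~2 disks, roughly the IOPS of two disks
zpool create tank mirror da0 da1 mirror da2 da3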

There is a lot of knowledge and experience among the forum members, so please ask any questions you have and someone will certainly help you, even if it is just to say that it isn't a good idea, because FreeNAS and ZFS are not the answer to every question. ZFS was designed to be reliable, and making it fast as well can be quite expensive. The question is: how fast does it need to be? Are you aiming for 10Gb networking or planning a 1Gb network, and how many users will be accessing the system simultaneously?
 
Joined
Jun 4, 2019
Messages
5
Thanks
1
@wrayste ZFS was not designed to be the fastest filesystem per se, but using sync=always on a dataset in your dd test will have given a low value. In any case, the fio program is a far better way to benchmark your pool, and you need to appreciate the difference in the way ZFS handles sync and async writes (see for example: https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/ and https://www.ixsystems.com/blog/zfs-zil-and-slog-demystified/).

With just 4 drives I'd suggest you have only two realistic choices of pool layout: raidz2 or a stripe of mirrors. The latter maximises IOPS, as explained here: https://www.ixsystems.com/blog/zfs-pool-performance-1/ & https://www.ixsystems.com/blog/zfs-pool-performance-2/ So Chris Moore's question re: proposed use such as iSCSI, SMB, or NFS for shares is very pertinent here.

Alongside the fio benchmarking program, the CLI tools gstat, zilstat and zpool iostat can be used to monitor disk/pool performance/activity.
Thanks, those links are really useful. I think I may have glanced at some of them before, but the mistake I made was assuming they were only relevant when using cache drives. I'll look at using fio to investigate further.

@anmnz Thanks for the extra information; together with the above this is now much clearer, and I think the numbers I saw are probably realistic given what is actually going on with those settings.

... I have servers at work that have 256GB of RAM and use it all, all the time. ZFS can be fast, if you give it enough resources.
...
The question is, how fast does it need to be? Are you trying to do 10Gb networking or do you plan a 1Gb network and how many users will be accessing the system simultaneously?
Yep, I understood the importance of RAM (and ECC), which led me to go for a platform that supports RDIMMs; depending on price I could add another 128 GB later. The 10Gb is for future-proofing at the moment, as the alternative board with 4x 1Gb wasn't any more attractive.

Thanks all for the help. As mentioned, the fact that the ZIL is written to the pool alongside the data (when not using a separate log device) is probably the critical piece of information I'd overlooked.
 
Joined
Apr 17, 2019
Messages
14
Thanks
1
Silly question, just trying to figure out what to throw at a new build I'm working on with 192GB of RAM. From what I understood from this thread, running the following would create a 100GB file to test read/write...

Code:
# write
dd if=/dev/zero of=tmp.dat bs=2048k count=50k
# read
dd if=tmp.dat of=/dev/null bs=2048k count=50k

Should I use this as well to test it on my system or change it to a larger number to account for the large RAM size?
 
Joined
Jun 4, 2019
Messages
5
Thanks
1
Silly question, just trying to figure out what to throw at a new build I'm working on with 192GB of RAM. From what I understood from this thread, running the following would create a 100GB file to test read/write...

Code:
# write
dd if=/dev/zero of=tmp.dat bs=2048k count=50k
# read
dd if=tmp.dat of=/dev/null bs=2048k count=50k

Should I use this as well to test it on my system or change it to a larger number to account for the large RAM size?
The simplest option I found was to turn off compression and force synchronous writes (sync=always), then run the above. This should give you worst-case performance for the drives.
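Concretely, something like this on a scratch dataset (the names are just examples) before running the dd commands above:

Code:
# scratch dataset with worst-case settings for the test
zfs create tank/ddtest
zfs set compression=off tank/ddtest
zfs set sync=always tank/ddtest

# then run the dd write/read commands from /mnt/tank/ddtest,
# and destroy the dataset afterwards
zfs destroy tank/ddtest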

Another option is to use the 'fio' tool, which is a bit more in-depth and will possibly provide more accurate results.

With asynchronous writes you'll get much better performance, and with 192 GB of RAM a lot of headroom. I wasn't so interested in RAM performance, which is why I tested with the settings above.

Good luck with your testing; it is a bit of an art, and the most important thing I learnt was about the double writes (described in some of the links above).
 