Why ZIL Size Matters (or Doesn’t)
Many applications require more writes than reads, so customers want to know how to tune ZFS for their requirements. I let them know that ZFS requires little tuning and TrueNAS requires even less. This is because TrueNAS uses TrueCache™ and places fast, non-volatile read and write caches in front of the disks.
Customers then say, “I use FreeNAS, not TrueNAS. What about me?” ZFS loves RAM and uses it for many things: read caching of the “hot data” set for your filer, metadata, L2ARC reference data, and other items. But adding RAM is not the way to improve write performance. Use a ZFS “separate intent log” (SLOG) device instead!
TrueNAS uses a SLOG that has non-volatile RAM as a write cache. It’s easier to refer to this as the ZIL, but that is not totally accurate. ZFS always has a ZIL, or “ZFS Intent Log”. However, if you don’t have a SLOG, the ZIL is stored on the hard disks in your pool and your synchronous writes get hard disk performance. Depending on your level of protection (RAID-Z1, RAID-Z2, etc.), this may be slower than you expect. For example, if you use RAID-Z2, each write will be performed 6 times. You paid for 144 IOPS out of your hard disk, but you only get 144 / 6 = 24.
Now you know why TrueNAS uses a SLOG. With a SLOG, writes are committed as soon as they hit the cache, so they no longer wait on hard disk IOPS. The application sees the data as written to disk and can continue with additional write operations. A question that often comes up is: How much space do I need? Do I need an expensive SLC SSD that stores hundreds of gigabytes for my SLOG, or can I use a much smaller and cheaper SSD?
The answer is that you can use the smaller, cheaper SSD. The SLOG is very specific in how it functions, so you only have to worry about one aspect of your IO: write performance. To take this a step further, only synchronous writes matter. Now you’re saying: how in the world do I determine this?
There are some guidelines you can use if you are building a filer for hundreds or thousands of users or for an extremely busy database application, but I’m going to give guidance for the other 95%+ of you.
By default, ZFS will take data written to the ZIL and write it to your pool every 5 seconds. Here is some simple throughput math using a 1Gb Ethernet connection. The maximum throughput, ignoring overheads and assuming one direction, would be 1 gigabit / 8 = 0.125 gigabytes per second. With 5 seconds between SLOG flushes and a 1Gb link carrying 100% synchronous writes, the most you will see written to your SLOG is 5 x 0.125 GB = 0.625 GB.
This shows that you don’t need much space for a SLOG and can use a smaller SSD. If you have a write-intensive application that requires multiple 1Gb Ethernet connections or a 10Gb link, you can increase the size proportionally.
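The sizing math above can be written as a few lines of Python. This is only a rough sketch under the same assumptions as the text: overheads are ignored, traffic flows in one direction, and the flush interval is 5 seconds. The function name slog_capacity_gb and its parameters are made up for this illustration and are not part of ZFS, FreeNAS, or TrueNAS.

```python
def slog_capacity_gb(link_gbit_per_s, flush_interval_s=5.0, sync_fraction=1.0):
    """Rough upper bound, in gigabytes, on data written to the SLOG between flushes.

    link_gbit_per_s  -- total network bandwidth in gigabits per second
    flush_interval_s -- seconds between flushes to the pool (5 in the article)
    sync_fraction    -- share of the traffic that is synchronous writes (0.0 to 1.0)

    All names here are hypothetical; protocol overhead is ignored.
    """
    if link_gbit_per_s < 0 or flush_interval_s < 0:
        raise ValueError("bandwidth and flush interval must be non-negative")
    if not 0.0 <= sync_fraction <= 1.0:
        raise ValueError("sync_fraction must be between 0.0 and 1.0")

    # 8 bits per byte: 1 gigabit per second is 0.125 gigabytes per second.
    throughput_gb_per_s = link_gbit_per_s / 8.0
    return throughput_gb_per_s * flush_interval_s * sync_fraction


if __name__ == "__main__":
    # Example link speeds: one 1Gb link, four 1Gb links, one 10Gb link.
    for label, gbit in [("1 x 1Gb", 1), ("4 x 1Gb", 4), ("1 x 10Gb", 10)]:
        print(f"{label}: about {slog_capacity_gb(gbit):.3f} GB between flushes")
```

For a single 1Gb link this reproduces the 0.625 GB figure, and for a 10Gb link it gives 6.25 GB, which is still a small fraction of even a modest SSD.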
So, bringing it home: when choosing an SSD for a SLOG device, don’t worry about space. Choose an SSD that has extremely low latency and high write IOPS, and is reliable. iXsystems did so for TrueNAS and the FreeNAS Mini, and so should you.