To SLOG or not to SLOG: How to best configure your ZFS Intent Log
In the world of storage, caching can play a big role in improving performance, and OpenZFS offers some very powerful tools to improve both read and write performance. To improve read performance, ZFS uses system memory as an Adaptive Replacement Cache (ARC), which holds your file system’s most frequently and most recently used data. You can then add a Level 2 Adaptive Replacement Cache (L2ARC) to extend the ARC onto a dedicated disk (or disks) to dramatically improve read speeds, effectively giving cached reads all-flash performance.
OpenZFS also includes something called the ZFS Intent Log (ZIL). Like the L2ARC, the ZIL can be set up on a dedicated disk, called a Separate Intent Log (SLOG), but it is not simply a performance-boosting technology. This article aims to provide the information needed to understand what the ZIL does and how it works, to help you determine when a SLOG will help and how to optimize write performance in general.
Isn’t the ZIL just the ZFS name for a write cache?
Many people think of the ZFS Intent Log like they would a write cache. This causes some confusion in understanding how it works and how to best configure it. First of all, the ZIL is more accurately referred to as a “log” whose main purpose is actually for data integrity. It exists to keep track of in-progress, synchronous write operations so they can be completed or rolled back after a system crash or power failure. Standard caching generally utilizes system memory and data is lost in those scenarios. The ZIL prevents that.
Second, the ZIL does not handle asynchronous writes by default. Those simply go through system memory like they would on any standard caching system. This means that the ZIL only works out of the box in select use cases, like database storage or virtualization over NFS. OpenZFS does allow a workaround if you decide to opt for the extra level of data integrity in your asynchronous writes, by switching from “sync=standard” to “sync=always” mode, but that must be manually configured.
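The rule described above can be written down as a small decision function. This is only an illustrative model, not OpenZFS code: the names SyncMode and uses_zil are made up for this sketch, and the third value, sync=disabled, is a real setting that this article does not otherwise discuss and is included only for completeness.

```python
from enum import Enum


class SyncMode(Enum):
    """Values of the ZFS "sync" dataset property (names here are illustrative)."""
    STANDARD = "standard"  # default: only synchronous writes are logged
    ALWAYS = "always"      # every write is treated as synchronous
    DISABLED = "disabled"  # nothing is logged (not covered in this article)


def uses_zil(mode: SyncMode, write_is_synchronous: bool) -> bool:
    """Return True if a write would be logged to the ZIL under this setting."""
    if mode is SyncMode.ALWAYS:
        return True
    if mode is SyncMode.DISABLED:
        return False
    # sync=standard: the application decides by issuing a synchronous write
    return write_is_synchronous


if __name__ == "__main__":
    for mode in SyncMode:
        for is_sync in (True, False):
            kind = "synchronous" if is_sync else "asynchronous"
            print(f"sync={mode.value:<8} {kind:<12} write -> ZIL: {uses_zil(mode, is_sync)}")
```

On a real system the setting is changed per dataset with the zfs set command, for example "zfs set sync=always" followed by the dataset name.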
Third, the ZIL, in and of itself, does not improve performance. By default the ZIL sits in your existing data pool, which is usually composed of spinning disks, and logs synchronous writes there before they are periodically flushed to their final location in storage. This means that your synchronous writes not only operate at the speed of your storage pool, but also have to be written to the pool twice, sometimes more depending on your level of disk redundancy.
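To put rough numbers on "twice, sometimes more": the per-layout write counts given later in this article (4 for RAID-Z1, 6 for RAID-Z2, 8 for RAID-Z3) are consistent with two logical writes, each costing one data write plus one write per parity level. The sketch below assumes that formula, which is inferred from those figures rather than taken from OpenZFS itself; the names are hypothetical.

```python
# Parity levels per layout. The formula below is an assumption inferred
# from this article's own figures, not an official OpenZFS calculation.
PARITY_LEVELS = {"RAID-Z1": 1, "RAID-Z2": 2, "RAID-Z3": 3}

# With an in-pool ZIL, a synchronous write lands in the pool twice:
# once in the ZIL and once in its final location.
LOGICAL_WRITES_WITH_IN_POOL_ZIL = 2


def disk_writes_per_sync_write(layout: str) -> int:
    """Estimate disk writes for one synchronous write with an in-pool ZIL."""
    parity = PARITY_LEVELS[layout]
    return LOGICAL_WRITES_WITH_IN_POOL_ZIL * (1 + parity)


if __name__ == "__main__":
    for layout in PARITY_LEVELS:
        print(f"{layout}: {disk_writes_per_sync_write(layout)} writes")
```

Treat the result as a way to compare layouts, not as a measurement; real pools vary with record size, vdev width, and workload.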
How should you configure your ZIL?
As stated above, the ZIL’s primary purpose is to protect data in the case of a system crash or power failure, and it comes with a performance penalty because synchronous data must be written to the ZIL before it reaches its final location in your storage pool. What is needed for performance improvement is a dedicated SLOG, such as a low-latency SSD or similar device (ZeusRAM, etc.), so your ZIL-based writes will not be limited by your pool IOPS or subject to the RAID penalties you face with additional parity disk writes. And even with a dedicated SLOG, you will not see performance improvements out of the box on asynchronous writes, as they do not use the ZIL by default.
To optimize your ZIL performance, the following things should be considered:
- Use case: If your use case involves synchronous writes, using a SLOG for your ZIL will provide a benefit. Database applications, NFS environments (particularly for virtualization), and backups are known use cases with heavy synchronous writes.
- Storage pool protection (RAID): When your ZIL is in-pool, you incur a standard overhead of 2 writes plus the write penalty of your RAID configuration, which comes to 4 writes total per transaction with RAID-Z1 (and mirroring), 6 with RAID-Z2, and 8 with RAID-Z3. RAID-10 adds no performance penalty over raw disks.
- “sync=standard” vs. “sync=always”: Asynchronous writes are not protected by the ZIL in the default “sync=standard” configuration under OpenZFS. If losing a couple of seconds’ worth of write data in a power loss or system crash would be harmful to your operations, setting ZFS to “sync=always” will force all writes through the ZIL. This will make all your writes perform at the speed of the device that holds your ZIL, so you will want a dedicated SLOG under this configuration or writes will be painfully slow.
- Choosing a SLOG device: OpenZFS aggregates your writes into “transaction groups” which are flushed to their final location periodically (every 5 seconds in FreeNAS & TrueNAS). This means that your SLOG device only needs to be able to store as much data as your system throughput can provide over those 5 seconds. Over a 1 Gb/s (gigabit) connection, this would be about 0.625 GB. Correspondingly, a 10 Gb/s connection would require 6.25 GB and 4x10 Gb/s would require 25 GB. This means latency, rather than size, is your main consideration in choosing a device.
- Performance requirements: If you have a use case that utilizes the ZIL, purchasing a dedicated SLOG device is a good way to improve performance. You can even use multiple SLOG devices, which OpenZFS will stripe across for improved performance. OpenZFS also allows for the SLOG to be mirrored, which can protect against performance degradation and avoid any data loss during a device failure. This means you can scale up your ZIL performance to handle high storage volumes with more availability for a relatively low cost.
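The SLOG sizing figures above come from simple arithmetic: link speed in gigabits per second, divided by 8 to get gigabytes per second, multiplied by the flush interval. Here is a minimal sketch of that calculation, reading the connection speeds as gigabits per second and assuming the 5-second interval quoted above; the function name and parameters are hypothetical.

```python
def slog_capacity_gb(link_gbit_per_s: float,
                     links: int = 1,
                     flush_interval_s: float = 5.0) -> float:
    """Upper bound on the data (in GB) a SLOG must hold between flushes.

    Assumes the network links are the bottleneck and are fully saturated
    with synchronous writes for the whole flush interval.
    """
    gbytes_per_s = link_gbit_per_s * links / 8  # gigabits/s -> gigabytes/s
    return gbytes_per_s * flush_interval_s


if __name__ == "__main__":
    for label, speed, count in [("1 Gb/s", 1, 1),
                                ("10 Gb/s", 10, 1),
                                ("4x10 Gb/s", 10, 4)]:
        print(f"{label}: {slog_capacity_gb(speed, count)} GB")
```

This reproduces the 0.625 GB, 6.25 GB, and 25 GB figures. Any device larger than that gains nothing from the extra space, which is why latency is the deciding factor.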
OpenZFS provides powerful tools to give your FreeNAS & TrueNAS storage blazing performance with the cost of spinning disk storage. It allows you to add multiple levels of protection and disk redundancy to keep your data safe from corruption and loss. The ZFS Intent Log, or ZIL, is frequently discussed in vague terms that don’t provide a full picture of the benefits it provides or how to implement it properly. With the above information, you will have a better idea of how to get maximum performance with write protection for your storage environment.