To SLOG or not to SLOG: How to best configure your ZFS Intent Log


by Mark VonFange

In the world of storage, caching can play a big role in improving performance. OpenZFS offers some very powerful tools to improve read and write performance. To improve read performance, ZFS utilizes system memory as an Adaptive Replacement Cache (ARC), which stores your file system’s most frequently and most recently used data in system memory. You can then add a Level 2 Adaptive Replacement Cache (L2ARC) to extend the ARC to a dedicated disk (or disks), dramatically improving read speeds and effectively giving the user all-flash read performance for frequently accessed data.
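
Adding an L2ARC device is a single command. As a minimal sketch, assuming a hypothetical pool named tank and a spare SSD at /dev/ada2:

    # Extend the ARC onto a dedicated SSD (the L2ARC)
    zpool add tank cache /dev/ada2

    # Confirm the cache vdev is now attached to the pool
    zpool status tank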

OpenZFS also includes something called the ZFS Intent Log (ZIL). The ZIL can be set up on a dedicated disk called a Separate Intent Log (SLOG) similar to the L2ARC, but it is not simply a performance boosting technology. This article aims to provide the information needed to understand what the ZIL does and how it works to help you determine when SLOG will help and how to optimize write performance in general.

Isn’t the ZIL just the ZFS name for a write cache?

Many people think of the ZFS Intent Log as a write cache, which causes some confusion about how it works and how best to configure it. First, the ZIL is more accurately described as a log, and its main purpose is data integrity: it keeps track of in-progress, synchronous write operations so they can be completed or rolled back after a system crash or power failure. Standard caching generally relies on system memory, where data is lost in those scenarios. The ZIL prevents that.

Second, the ZIL does not handle asynchronous writes by default. Those simply go through system memory, as they would on any standard caching system. This means the ZIL only helps out of the box in select use cases, like database storage or virtualization over NFS. If you want that extra level of data integrity for your asynchronous writes as well, OpenZFS lets you force them through the ZIL by switching a dataset from “sync=standard” to “sync=always”, but that must be configured manually (see the sketch below).
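
As a minimal sketch of that change, assuming a hypothetical dataset named tank/vms:

    # Show the current sync policy (the default is "standard")
    zfs get sync tank/vms

    # Force all writes, asynchronous ones included, through the ZIL
    zfs set sync=always tank/vms

    # Revert to the default behavior if the penalty proves too steep
    zfs set sync=standard tank/vms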

Third, the ZIL, in and of itself, does not improve performance. By default, the ZIL sits in your existing data pool, usually composed of spinning disks, where it logs synchronous writes before they are periodically flushed to their final location in storage. This means your synchronous writes not only operate at the speed of your storage pool, but also have to be written to the pool twice, sometimes more, depending on your level of disk redundancy.

How should you configure your ZIL?

As stated above, the ZIL’s primary purpose is to protect data in the case of a system crash or power failure, and it carries a performance penalty because every synchronous write must land in the ZIL before making it to your storage pool. What is needed for a performance improvement is a dedicated SLOG, like a low-latency SSD or other similar device (a ZeusRAM, etc.), so your ZIL-based writes are not limited by your pool’s IOPS or subject to the RAID penalties of additional parity writes. And even with a dedicated SLOG, you will not enjoy out-of-the-box performance improvements on asynchronous writes, as they do not utilize the ZIL by default.
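
Attaching a dedicated SLOG is also a one-line operation. A minimal sketch, again using the hypothetical pool tank and a low-latency NVMe device at /dev/nvd0:

    # Dedicate a fast device to the ZIL as a SLOG
    zpool add tank log /dev/nvd0

After this, zpool status will list the device under its own "logs" heading, separate from the data vdevs.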

To optimize your ZIL performance, the following things should be considered:

  • Use case: If your use case involves synchronous writes, utilizing a SLOG for your ZIL will provide a benefit. Database applications, NFS environments (particularly for virtualization), and backups are known use cases with heavy synchronous writes.
  • Storage pool protection (RAID): When your ZIL is in-pool, you run a standard performance overhead of 2 writes plus the write penalty of your RAID configuration, which comes to 4 writes total per transaction with RAID-Z1 (and with mirrors), 6 with RAID-Z2, and 8 with RAID-Z3. RAID-10 adds no parity writes on top of its mirrored copies.
  • “sync=standard” vs. “sync=always”: Asynchronous writes are not protected by the ZIL in the default “sync=standard” configuration under OpenZFS. If losing a few seconds’ worth of write data to a power loss or system crash would be harmful to your operations, setting ZFS to “sync=always” will force all writes through the ZIL. This makes all of your writes perform at the speed of the device holding your ZIL, so you will want a dedicated SLOG under this configuration or writes will be painfully slow.
  • Choosing a SLOG device: OpenZFS aggregates your writes into “transaction groups,” which are flushed to their final location periodically (every 5 seconds in FreeNAS & TrueNAS). This means your SLOG device only needs to hold as much data as your system throughput can deliver over those 5 seconds. Over a 1Gb/s connection, that is about 0.625GB (1Gb/s is roughly 125MB/s, times 5 seconds); correspondingly, a 10Gb/s connection would require about 6.25GB, and 4x10Gb/s about 25GB. This means latency, rather than size, is your main consideration in choosing a device.
  • Performance requirements: If you have a use case that utilizes the ZIL, purchasing a dedicated SLOG device is a good way to improve performance. You can even use multiple SLOG devices, which OpenZFS will stripe across for improved performance. OpenZFS also allows the SLOG to be mirrored, which protects against data loss and against the performance drop of falling back to an in-pool ZIL if a device fails (see the sketch after this list). This means you can scale up your ZIL performance to handle high storage volumes with more availability at a relatively low cost.
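
As a minimal sketch of both layouts, using hypothetical device names:

    # Stripe the log across two devices for higher ZIL throughput
    zpool add tank log /dev/nvd0 /dev/nvd1

    # Or mirror the log so a single device failure loses no in-flight data
    zpool add tank log mirror /dev/nvd0 /dev/nvd1

Log vdevs can also be removed later with "zpool remove" if your requirements change.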

Conclusion

OpenZFS provides powerful tools to give your FreeNAS & TrueNAS storage blazing performance at the cost of spinning-disk storage. It allows you to add multiple levels of protection and disk redundancy to keep your data safe from corruption and loss. The ZFS Intent Log, or ZIL, is frequently discussed in vague terms that don’t paint a full picture of the benefits it provides or how to implement it properly. With the above information, you should have a better idea of how to get maximum write performance, with protection, for your storage environment.

Additional ZIL Related Resources

ZFS:

https://blogs.oracle.com/realneel/entry/the_zfs_intent_log

http://blog.delphix.com/ahl/2012/zfs-fundamentals-transaction-groups/

http://www.freenas.org/whats-new/2015/11/zfs-zil-and-slog-demystified.html

https://www.ixsystems.com/whats-new/2015/02/04/why-zil-size-matters-or-doesnt/

Choosing a SLOG device:

http://www.tomshardware.com/reviews/ssd-recommendation-benchmark,3269.html

http://ssd.userbenchmark.com/Explore/Fastest-SSD/8

http://ssds.specout.com/

http://www.ssdreview.com/?page=1


5 Comments

  1. Collin C. MacMilla

    “OpenZFS allows for the SLOG to be mirrored, which can improve your ZIL performance compared to a single SLOG.”

    How does mirroring the ZLOG increase ZIL performance over non-mirrored ZLOG? Doesn’t the commit to disk (the “second write”) happen from ARC? Wouldn’t a mirror only improve a read from disk that never really happens?

    It would seem that two or more ZLOG disk groups would improve ZLOG write performance since parallelism would be in favor of the write.

    • Mark VonFange

      Thanks for your comment, Collin. We’ve looked into things and updated that section to better explain how multiple SLOG devices can improve performance and guard against performance degradation.

    • Eman

      “Wouldn’t a mirror only improve a read from disk that never really happens?”

      The cached data is written out to the pool at a certain interval, meaning it needs to be read from the SLOG when it’s written out to the pool…

      • David LeBlanc

        The SLOG is never read from unless there is a failure (power or crash). The purpose of a SLOG is to keep the IO transactions in non-volatile space until the transactions have been committed to the pool’s dataset. When the transactions are committed to disk they are read from the transaction group (TXG) stored in RAM. So, essentially, the SLOG is just a backup device in the event that the TXG is lost due to power failure or crash. Looking at the Disk IO in the reporting section of FreeNAS will illustrate that your SLOG device is only being written to.

        Mirroring the SLOG would only prevent your SLOG vdev from becoming inaccessible if one of the SLOG devices fails. In the event that the SLOG completely fails (single drive, or both in a mirror), the ZIL will be written out to the pool, rather than the no-longer-accessible SLOG device. You will not see any performance improvement from a mirrored SLOG under normal conditions.

        A SLOG device can be striped though. This will increase performance. However, with each additional drive, you increase the chance of the SLOG failing over time. Your choices would be to add a stripe of mirrored SLOGs or to invest in a fast NVMe and replace SSDs. It depends on budget and drive bay availability.

  2. Manny Lakis

    Great article. If we build an all-SSD RAID-10 pool from MLC drives, would we benefit by installing a dedicated SLC SSD-based ZIL drive? The application is VMware with NFS pools. This would reduce the wear on the MLC drives by not writing the data twice, I assume, but I’m not sure whether it would provide any visible performance improvement.

