dairyengguy
Dabbler
- Joined
- Jul 17, 2015
- Messages
- 28
I was trying to find out how to size a SLOG, and I came across this post. I had been planning on pricing a SLOG so that I can enable sync and maintain performance on my test install. It raised a few questions for me.
From here: https://pthree.org/2013/04/19/zfs-administration-appendix-a-visualizing-the-zfs-intent-log/
I was wondering about this, as I have seen threads on this forum suggesting that we should avoid async due to potential corruption issues. However, it would seem contrary to this blog. Can anyone explain what I am missing here?
Asynchronous Writes
Asynchronous writes have a history of being "unstable". You have been taught that you should avoid asynchronous writes, and if you decide to go down that path, you should prepare for corrupted data in the event of a failure. For most filesystems, there is good counsel there. However, with ZFS, it's a nothing to be afraid of. Because of the architectural design of ZFS, all data is committed to disk in transaction groups. Further, the transactions are atomic, meaning you get it all, or you get none. You never get partial writes. This is true with asynchronous writes. So, your data is ALWAYS consistent on disk- even with asynchronous writes.
So, if that's the case, then what exactly is going on? Well, there actually resides a ZIL in RAM when you enable "sync=disabled" on your dataset. As is standard with the previous synchronous architectures, the data blocks of the application are sent to a ZIL located in RAM. As soon as the data is in the ZIL, RAM acknowledges the write, and then flushes the data do disk, as would be standard with synchronous data.
I know what you're thinking: "Now wait a minute! The are no acknowledgements with asynchronous writes!" Not always true. With ZFS, there is most certainly an acknowledgement, it's just one coming from very, very fast and extremely low latent volatile storage. The ACK is near instantaneous. Should there be a crash or some other failure that causes RAM to lose power, and the write was not saved to non-volatile storage, then the write is lost. However, all this means is you lost new data, and you're stuck with old but consistent data. Remember, with ZFS, data is committed in atomic transactions.
From here: https://pthree.org/2013/04/19/zfs-administration-appendix-a-visualizing-the-zfs-intent-log/
I was wondering about this, as I have seen threads on this forum suggesting that we should avoid async due to potential corruption issues. However, it would seem contrary to this blog. Can anyone explain what I am missing here?