Do I need a cache SSD?

Zak95

Dabbler
Joined
Mar 4, 2018
Messages
20
I have 6 x 2-drive mirrors in my pool (12 x 4TB SAS drives total), and I added a 240GB SSD for SLOG. The server itself has 256GB RAM (Dell R720xd). I have a spare slot in the back for another SSD; is it worth adding a cache drive as well? The use case is VMs in a home lab, with 10Gb networking between my hosts and main PC. I also rsync from a QNAP NAS to FreeNAS, and run some Windows file shares. I also use Veeam for VM backups; the backup repository is an NFS share on my NAS. Most of the VMs are on datastores from FreeNAS, but every night they get copied to FreeNAS as a second backup.

Thanks.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
With that much RAM, you probably won't see much benefit from L2ARC. You can add it, have a look at your hit rate for a while and remove it later if it's doing nothing.
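A minimal sketch of that check and the later removal, assuming the pool is named tank and the cache SSD shows up as da6 (both hypothetical names):

# Watch the L2ARC hit/miss counters over time (FreeBSD kstat sysctls):
sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses
# If the hit count barely moves, a cache device can be removed live:
zpool remove tank da6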

Your SLOG is mostly wasted: due to the short transaction group timeout, ZFS will never put more than about 30GB on a SLOG even in a worst-case situation. If you added the right kind of SLOG it can be helping, but just any old SSD will probably be hurting you more than helping (risk of data loss in a power cut).

I've seen a theory that partitioning the SLOG SSD to 30GB and leaving the rest empty allows for sector sparing in the free space, so the drive can live a very long service life.
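On FreeBSD that would look something like this (a sketch; da5 and tank are hypothetical names):

gpart create -s gpt da5                      # GPT scheme on the SSD
gpart add -t freebsd-zfs -a 1m -s 30G da5    # one 30GB partition, rest left empty
zpool add tank log da5p1                     # add the partition, not the whole disk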
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
SLOG helps with sync writes, and nothing else. You say "VMs" and "NFS", so this may actually be a decent use case here. The trouble with a SLOG is that unlike an L2ARC, corruption on a SLOG will corrupt data. There are some rather inexpensive Intel Optane cards available that plug into PCIe, which survive a power outage gracefully. I think they're around 30GB. A pair of those in mirror sounds like a better idea than a SATA SSD.
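Adding them as a mirror is one command; a sketch, assuming the pool is named tank and the cards show up as nvd0 and nvd1 (all hypothetical names):

# Mirrored SLOG across both Optane cards:
zpool add tank log mirror nvd0 nvd1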

Are you seeing sync write performance loss from your VMs? You'd also want to make sure the datastore the VMs are on is no more than 50% full.
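Both are easy to check from the shell; a sketch, again assuming the pool is named tank:

zpool list -o name,size,alloc,free,cap tank   # keep cap under ~50% for VM datastores
zilstat 1                                     # per-second ZIL activity, if zilstat is on your build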

As for L2ARC, you don't have to guess. Use 'zfs-stats -a' and see what your ARC hit rate is. Then consider the size of your typical workload. If hit rate is low and workload would fit into L2ARC, you can benefit.

A single SATA SSD is a decent way to start dabbling with an L2ARC. You benefit from the size, and ZFS gracefully handles L2ARC corruption.
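Adding it is a one-liner; sketch, with tank and da6 once more as hypothetical names:

# A whole disk is fine for L2ARC; corruption there is harmless to the pool:
zpool add tank cache da6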
 

Zak95

Dabbler
Joined
Mar 4, 2018
Messages
20
Thanks for the replies. Here is some output from zfs-stats; does it flag up anything obvious? The SLOG was a 240GB SanDisk I had lying around, with only about 30 hours on it. The cache drive would be an HP enterprise SSD, 480GB.

FreeBSD 11.3-RELEASE-p6 #0 r325575+d5b100edfcb(HEAD): Fri Feb 21 18:53:26 UTC 2020 root
2:29PM up 11 days, 17:19, 1 user, load averages: 0.51, 0.57, 0.44
------------------------------------------------------------------------
System Memory Statistics:
Physical Memory: 262054.47M
Kernel Memory: 3060.63M
DATA: 98.49% 3014.70M
TEXT: 1.50% 45.92M
------------------------------------------------------------------------
ZFS pool information:
Storage pool Version (spa): 5000
Filesystem Version (zpl): 5
------------------------------------------------------------------------
ARC Misc:
Deleted: 45876435
Recycle Misses: 0
Mutex Misses: 9562
Evict Skips: 9562

ARC Size:
Current Size (arcsize): 90.34% 229895.63M
Target Size (Adaptive, c): 90.32% 229852.90M
Min Size (Hard Limit, c_min): 12.50% 31808.03M
Max Size (High Water, c_max): ~8:1 254464.25M

ARC Size Breakdown:
Recently Used Cache Size (p): 89.61% 206019.21M
Frequently Used Cache Size (arcsize-p): 10.38% 23876.41M

ARC Hash Breakdown:
Elements Max: 5781948
Elements Current: 90.03% 5205929
Collisions: 12276076
Chain Max: 0
Chains: 364129

ARC Eviction Statistics:
Evicts Total: 5495296817664
Evicts Eligible for L2: 99.98% 5494320083456
Evicts Ineligible for L2: 0.01% 976734208
Evicts Cached to L2: 0

ARC Efficiency
Cache Access Total: 346859828
Cache Hit Ratio: 94.73% 328583983
Cache Miss Ratio: 5.26% 18275845
Actual Hit Ratio: 94.49% 327781251

Data Demand Efficiency: 94.97%
Data Prefetch Efficiency: 42.69%

CACHE HITS BY CACHE LIST:
Most Recently Used (mru): 17.52% 57585689
Most Frequently Used (mfu): 82.23% 270195562
MRU Ghost (mru_ghost): 0.00% 11937
MFU Ghost (mfu_ghost): 0.34% 1149211

CACHE HITS BY DATA TYPE:
Demand Data: 19.98% 65672192
Prefetch Data: 0.56% 1869690
Demand Metadata: 79.38% 260842621
Prefetch Metadata: 0.06% 199480

CACHE MISSES BY DATA TYPE:
Demand Data: 19.01% 3475976
Prefetch Data: 13.73% 2509428
Demand Metadata: 67.19% 12279625
Prefetch Metadata: 0.05% 10816
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
You have a cache hit ratio of 94%; there is no need for an L2ARC here.

As for your SLOG, an HP Enterprise SSD at 480GB is larger than you need. It'll work, and it should protect you against a power outage (capacitors); you need to weigh whether you are paranoid enough to want it mirrored.
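If the paranoia wins out, you can convert a single log device to a mirror in place; a sketch, where da4p1 is the existing log partition and da5p1 the new device (hypothetical names):

# Attaches da5p1 as a mirror of the existing log device:
zpool attach tank da4p1 da5p1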

This is a decent intro to ZIL and SLOG: https://www.servethehome.com/what-is-the-zfs-zil-slog-and-what-makes-a-good-one/, and they make some recommendations: https://www.servethehome.com/buyers...as-servers/top-picks-freenas-zil-slog-drives/
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
My system is only a single-vdev Z3 with 128GB of RAM. I ran some experiments on the performance improvement from adding an L2ARC and found that it made a significant difference in my use case. Keep a couple of things in mind: in the current revision of FreeNAS (11.3), L2ARC is not persistent between reboots and will likely require several passes to get "hot". Adding L2ARC likely carries no risk in your use case because your system has so much RAM (L2ARC needs some ARC RAM for its index). I also use my L2ARC exclusively for metadata, to speed up directory browsing and rsync.
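That metadata-only restriction is just a dataset property; a minimal sketch, assuming the pool is named tank:

zfs set secondarycache=metadata tank   # values: all | none | metadata
zfs get secondarycache tank            # confirm the setting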

On the SLOG side, the articles Yorick referenced are good ones. I'd also look into the testing the community here has done to see the real-life performance of various devices in that role. It's lengthy for sure, but it nicely illustrates how general-purpose SSDs may not fare well at all, especially the type that marries a fast cache up front with much slower flash behind it.
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
My use case is 100% lab/personal use, and I've found SLOG to be pretty worthless - forcing sync writes on is a big performance hit even with a SATA SSD. Perhaps with an NVMe SSD it's not, but a proper SLOG-grade NVMe M.2 drive in the 2280 size doesn't really exist, so I use my M.2 slot for cache, where I DO see a difference - as said, once the data warms up. Most of my use case is reads as well, save for snapshots and replication of those snapshots for backups.
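For reference, that forcing is done per dataset via the sync property; a sketch with a hypothetical dataset tank/vms:

zfs set sync=always tank/vms     # every write goes through the ZIL/SLOG
zfs set sync=standard tank/vms   # back to honouring only client-requested syncs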
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
My SLOG is slightly longer at 110mm. I’ve been quite happy with it.
 