I'm trying to decide how to move forward with my ZFS setup (ZFS on Linux -- hopefully that doesn't make this off topic to post here; my questions are all high-level hardware choice and strategy questions).
- I already have the drives -- have had them for months -- and I've just been bogged down in research rabbit holes. One thing I'm realizing is that I ought to just go ahead and spin up the pool and load data into it: if I want to leverage a special metadata vdev, I can add it later and do a series of file moves within the same pool to rewrite the data, which should take care of distributing the metadata onto the new vdev. That should get me out of my procrastinated state on this project. I'd been planning a carefully orchestrated scheme of setting up a degraded RAIDZ2 from the start with one disk missing, so that I can buy one more disk at a later time to bring the additional redundancy online (rough sketch below). This scheme does require a bit of planning ahead, but I've been taking the planning ahead a bit far, hence the rabbit hole.
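For the record, the degraded-RAIDZ2 trick as I understand it: stand in a sparse file for the missing disk, then offline it immediately so nothing real ever lands on it. Device paths and sizes below are placeholders, not my actual layout:
Code:
# sparse file the same size as the real disks, standing in for the missing one
truncate -s 12T /tmp/fake-disk.img

# create the raidz2 with three real disks plus the fake one
zpool create tank raidz2 /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    /dev/disk/by-id/ata-DISK3 /tmp/fake-disk.img

# take the fake disk offline right away; the pool now runs degraded
zpool offline tank /tmp/fake-disk.img
rm /tmp/fake-disk.img

# later, when the fourth real disk arrives:
zpool replace tank /tmp/fake-disk.img /dev/disk/by-id/ata-DISK4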
- I'm really curious how far I could push the envelope on metadata access speed. I don't really have a hard use case, but there's something intrinsically satisfying about being able to query a large quantity of metadata quickly.
- What I currently do for my data is periodically (manually, on demand) run the find command below, which takes maybe a minute (probably longer these days, now that I've pointed my MacBook's Time Machine at this pool). Then I can do realtime fuzzy search of my entire file listing with no latency using a tool like fzf: https://github.com/junegunn/fzf This wonderful piece of software lets you search through gigabytes of text with ease. That's actually a strong argument that I don't *need* fast metadata access, but it still makes me really wonder what I could set up to fetch the metadata as fast as possible.
Code:
find /pool -type f -printf "%M %s %t %p\n" > ~/find_zfs
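For completeness, the search side is just feeding that listing into fzf, nothing fancier than this:
Code:
# interactive fuzzy search over the cached file listing
fzf < ~/find_zfs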
- I understand that metadata access with ZFS can be sped up through several methods. If the metadata fits in system RAM, it can be served from ARC. I have 96GB of RAM in the system, but for one reason or another the full listing isn't being cached well by ARC when I dump it with the find command -- probably because I don't have *that* much RAM.
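For anyone who wants to watch the same thing on their own system, the ARC counters are exposed as kstats on ZFS on Linux (exact field names vary a bit across OpenZFS versions, so treat this as a starting point):
Code:
# snapshot of overall ARC size and hit/miss counters
grep -E '^(size|hits|misses|arc_meta_used)' /proc/spl/kstat/zfs/arcstats

# or watch hit rates live while the find runs
arcstat 1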
- L2ARC can accelerate metadata reads. It can't accelerate metadata writes, but metadata writes are never really going to be a bottleneck for me.
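If I go the L2ARC route, my understanding is that it can even be restricted to caching metadata only, along these lines (device name is a placeholder):
Code:
# add an NVMe device as L2ARC
zpool add tank cache /dev/disk/by-id/nvme-EXAMPLE

# keep only metadata (not file data) in L2ARC for this pool's datasets
zfs set secondarycache=metadata tank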
- Using a special vdev for metadata can accelerate access in a guaranteed way. It's a commitment, though: the special vdev becomes critical to the pool (lose it and you lose the pool), so it needs mirrored devices, and it can't be removed again from a pool that contains raidz vdevs.
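The setup itself looks simple enough -- something like the following, with hypothetical device names (special_small_blocks is optional and only matters if you also want small file data on the fast devices):
Code:
# add a mirrored special vdev to an existing pool
zpool add tank special mirror /dev/disk/by-id/nvme-A /dev/disk/by-id/nvme-B

# optionally route small file blocks (not just metadata) to the special vdev
zfs set special_small_blocks=16K tank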
- My question is: will Optane offer any advantage over ordinary NVMe when used as a special vdev (in a mirror, of course)? It seems clear that it has advantages for SLOG, but I'm not terribly interested in SLOG performance because I don't mess around with networked VM storage, and I take care not to litter filesystems with tiny files in any software that I build (as well as avoid, where possible, software that does that kind of thing).
- I also have a separate question: I've read about Optane being beefy enough to host L2ARC and SLOG together, or SLOG and a special vdev together (I think I read that here), but what about all three (SLOG, special metadata vdev, and L2ARC) on one pair of devices? How insane would that actually be? I know this question sort of contradicts my earlier statement that I don't care about small-file write performance, but I'll gladly use it as justification for buying more gadgets I don't need (in this case, Optane drives).
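Just to make the question concrete, I picture the triple-duty layout looking roughly like this (sizes and device names are pure placeholders; SLOG and special need the mirror for safety, while L2ARC is disposable and can just be striped):
Code:
# carve each Optane into a small SLOG, a mid-size special, and the rest as L2ARC
for dev in /dev/disk/by-id/nvme-OPTANE-A /dev/disk/by-id/nvme-OPTANE-B; do
    sgdisk -n1:0:+16G -n2:0:+100G -n3:0:0 "$dev"
done

# mirrored SLOG, mirrored special vdev, striped L2ARC across both drives
zpool add tank log     mirror /dev/disk/by-id/nvme-OPTANE-A-part1 /dev/disk/by-id/nvme-OPTANE-B-part1
zpool add tank special mirror /dev/disk/by-id/nvme-OPTANE-A-part2 /dev/disk/by-id/nvme-OPTANE-B-part2
zpool add tank cache          /dev/disk/by-id/nvme-OPTANE-A-part3 /dev/disk/by-id/nvme-OPTANE-B-part3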
Thanks!