BUILD Intel Optane SSD P1600X SSDPEK1A118GA01 (118GB variety) -- best uses?

CookieMonster

Dabbler
Joined
May 26, 2022
Messages
34
What could be the possible use cases for:

Intel Optane SSD P1600X SSDPEK1A118GA01​

(118GB variety)


I snagged 3 of them for my server build while they were on sale, with the idea that I can return them until January if I don't end up needing them.

- ZFS has different types of caches. Which ones would benefit from this?
- Do they need to be run in pairs/triplets?
- Would it help to place the OS on this drive for the low-latency benefit, or is the OS fully in RAM after boot anyway?
- Would using them for VMs help?

ALSO: Do these play nicely with the AMD (Ryzen) platform? I would imagine they should be platform-agnostic because they're essentially ultra-low-latency SSDs, but I read conflicting stuff online saying that AMD doesn't work as well with Intel Optane. I would appreciate it if someone could clear up my confusion.

Thank you!
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Optane has PLP (explicit for DC drives, but assumed to be "built-in" even in consumer 900p/905p), low write latency and high endurance, so it is the SLOG of choice. (The next question is then, as always: Do you really need a SLOG?)

Optane is overkill for L2ARC—and then 118 GB may be too small. But if you do get a big drive, testing has shown that Optane has enough oomph for a partitioned drive to serve as both SLOG and L2ARC simultaneously, at least for home/lab use. (Mandatory warning about passing partitions to ZFS rather than whole drives: HERE BE DRAGONS!)

None of the above requires mirroring.
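For reference, attaching a SLOG or L2ARC is a one-command operation and either can be removed again later without harming the pool. A rough sketch from the shell, assuming a hypothetical pool named tank and Linux-style device names (how you partition the drive beforehand is up to you, dragons included):

Code:
# partitioned Optane: partition 1 as SLOG, partition 2 as L2ARC
zpool add tank log nvme0n1p1
zpool add tank cache nvme0n1p2
# log and cache vdevs can both be removed again later
zpool remove tank nvme0n1p1
zpool remove tank nvme0n1p2

On TrueNAS you would normally attach these through the web UI, which generally expects whole disks - hence the warning about hand-made partitions.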

A P1600X as boot drive would be massively wasteful overkill. But I do use 16 GB Optane M10 drives for boot… on account of 1) these having been hoarded at €9.99 apiece, 2) NVMe (PCIe 3.0 x2), so saving a precious SATA port for HDDs, 3) just big enough for boot, and 4) not really suitable for anything else, as these lowly M10 units somewhat lack the throughput and endurance for SLOG duty.

Optane would be overkill as special vdev or "VM/application pool", but possibly of the right size. Mirror the VM pool if it would not be acceptable to recreate it from scratch or restore from a periodic backup on a HDD pool. Definitely double or triple for a special vdev!

Optane as "super SSD" is indeed platform agnostic. I suppose that the comment about AMD refers to the use as "fast cache" for HDD (what the M10 were sold for to consumers): At least from a commercial point of view, this didn't work on Intel platforms either!
 

CookieMonster

Dabbler
Joined
May 26, 2022
Messages
34
Optane has PLP (explicit for DC drives, but assumed to be "built-in" even in consumer 900p/905p), low write latency and high endurance, so it is the SLOG of choice. (The next question is then, as always: Do you really need a SLOG?)

Optane is overkill for L2ARC—and then 118 GB may be too small. But if you do get a big drive, testing has shown that Optane has enough oomph for a partitioned drive to serve as both SLOG and L2ARC simultaneously, at least for home/lab use. (Mandatory warning about passing partitions to ZFS rather than whole drives: HERE BE DRAGONS!)

None of the above requires mirroring.

A P1600X as boot drive would be massively wasteful overkill. But I do use 16 GB Optane M10 drives for boot… on account of 1) these having been hoarded at €9.99 apiece, 2) NVMe (PCIe 3.0 x2), so saving a precious SATA port for HDDs, 3) just big enough for boot, and 4) not really suitable for anything else, as these lowly M10 units somewhat lack the throughput and endurance for SLOG duty.

Optane would be overkill as special vdev or "VM/application pool", but possibly of the right size. Mirror the VM pool if it would not be acceptable to recreate it from scratch or restore from a periodic backup on a HDD pool. Definitely double or triple for a special vdev!

Optane as "super SSD" is indeed platform agnostic. I suppose that the comment about AMD refers to the use as "fast cache" for HDD (what the M10 were sold for to consumers): At least from a commercial point of view, this didn't work on Intel platforms either!

Thank you...
If I use an old Xeon, as many here suggest, e.g. an E5-2697v2, there are no motherboards for that socket that have an M.2 slot, correct?

So, these would be useless? :(
And then if I still wanted to add an SSD cache drive, I would have to get a SATA version and occupy a precious SATA port with it, right?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Thank you...
If I use an old Xeon, as many here suggest, e.g. an E5-2697v2, there are no motherboards for that socket that have an M.2 slot, correct?

So, these would be useless? :(

M.2 is simply the form-factor of connecting the device - the actual protocol used is NVMe, and that operates over PCIe (PCI Express) which is certainly available on the v2 Xeon series.

There are a number of options for PCIe add-in cards that present M.2 sockets; they can be as simple as a passive board that plugs a single M.2 card into a single PCIe slot, or one that presents multiple M.2 sockets and either requires the motherboard to support "bifurcation" (splitting the single physical PCIe slot into multiple logical slots) or has its own PCIe switch chip (operating similarly to a network switch)

The passive adaptors can be as little as $15 - an active switching adaptor with PCIe x16 host support and four M.2 slots at x4 each may be several hundred dollars.

Optane would be overkill as special vdev

Minor disagreement with this; Optane is actually the ideal device for special vdevs, as it handles a mixed read/write workload extremely well compared to traditional NAND devices, and has a huge amount of endurance. If you're using special_small_blocks to put file data on them, you wouldn't want to have metadata access slowed down by writes to those small files. The challenge is mostly around the significantly higher cost per GB, which is amplified by the need for redundancy in special vdevs, such as a triple mirror to match a RAIDZ2 data vdev.
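If you do go down that road, special_small_blocks is a per-dataset property and only affects blocks written after it is set; a minimal sketch, assuming a hypothetical pool/dataset named tank/mydataset:

Code:
# send file blocks at or below 16K to the special vdev, in addition to metadata
zfs set special_small_blocks=16K tank/mydataset
# verify the current setting
zfs get special_small_blocks tank/mydataset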
 

CookieMonster

Dabbler
Joined
May 26, 2022
Messages
34
M.2 is simply the form-factor of connecting the device - the actual protocol used is NVMe, and that operates over PCIe (PCI Express) which is certainly available on the v2 Xeon series.

There are a number of options for PCIe add-in cards that present M.2 sockets; they can be as simple as a passive board that plugs a single M.2 card into a single PCIe slot, or one that presents multiple M.2 sockets and either requires the motherboard to support "bifurcation" (splitting the single physical PCIe slot into multiple logical slots) or has its own PCIe switch chip (operating similarly to a network switch)

The passive adaptors can be as little as $15 - an active switching adaptor with PCIe x16 host support and four M.2 slots at x4 each may be several hundred dollars.



Minor disagreement with this; Optane is actually the ideal device for special vdevs, as it handles a mixed read/write workload extremely well compared to traditional NAND devices, and has a huge amount of endurance. If you're using special_small_blocks to put file data on them, you wouldn't want to have metadata access slowed down by writes to those small files. The challenge is mostly around the significantly higher cost per GB, which is amplified by the need for redundancy in special vdevs, such as a triple mirror to match a RAIDZ2 data vdev.

Would a PCIe adapter add latency, thus defeating the main benefit of Optane?
Could you please recommend reliable ones? The Chinese adapters I saw on Newegg all have meh reviews saying they bottleneck the NVMe drives at reduced speeds (like 1500 MB/s or less).

Also... would you recommend keeping these 3 Optane modules or returning (any of) them? Would they benefit my build enough to justify the $66 apiece (about $197 for the three 118 GB modules), or would this money be better spent in a different way?

Thank you
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Would a PCIe adapter add latency, thus defeating the main benefit of Optane?
Could you please recommend reliable ones? The Chinese adapters I saw on Newegg all have meh reviews saying they bottleneck the NVMe drives at reduced speeds (like 1500 MB/s or less).

Passive adapters (either single-slot or bifurcating) will add no latency, as it's effectively just a physical form factor change, like putting a 2.5" drive into a 3.5" adapter.

Switching adapters may add a small amount of latency of their own, as the packets have to be routed through a PCIe switch chip. Adapters or slots that oversubscribe the lane count can add significant latency under load. For example, a card with four M.2 slots at x4 lanes each (x16 total) inserted into a "physically x16, electrically x8" slot on a motherboard will end up with only x8 lanes to the host.
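If you want to verify what a slot or adapter actually negotiated, the link width is visible from a shell. A quick sketch, assuming a hypothetical PCIe address of 03:00.0 (find the real one with a plain lspci listing first):

Code:
lspci -vv -s 03:00.0 | grep -i 'lnkcap\|lnksta'
# LnkCap is what the device supports, LnkSta is what was actually negotiated;
# a P1600X running at full speed should report "Speed 8GT/s, Width x4"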

Also... would you recommend keeping these 3 Optane modules or returning (any of) them? Would they benefit my build enough to justify the $66 apiece (about $197 for the three 118 GB modules), or would this money be better spent in a different way?

I took a quick look, but didn't see a "general build thread" in your history, so it's hard to comment - but at first glance, you likely aren't using sync writes that would benefit from SLOG (most common Optane use) and would need a number of adapters and a good use case to introduce a triple-mirror special vdev.

I'd hazard a guess that the ~USD$200 would probably be better spent buying additional RAM or a spare drive. If you can describe the rest of the proposed build and how it will be used, that would be helpful.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Also... would you recommend keeping these 3 Optane modules or returning (any of) them? Would they benefit my build enough to justify the $66 apiece (about $197 for the three 118 GB modules), or would this money be better spent in a different way?
That all depends on what you want to achieve…
For now, it's not apparent whether you have any use case for Optane. Assuming you do, and that it would require > 300 GB, a consumer 900p AIC (or U.2 drive, possibly second-hand) may be an easier fit than a bunch of M.2 drives.
 

CookieMonster

Dabbler
Joined
May 26, 2022
Messages
34
That all depends on what you want to achieve…
For now, it's not apparent whether you have any use case for Optane. Assuming you do, and that it would require > 300 GB, a consumer 900p AIC (or U.2 drive, possibly second-hand) may be an easier fit than a bunch of M.2 drives.

It was a Black Friday flash sale, so I had to act fast and think later, but originally I thought to use them for overall caching and also as some kind of fast buffer, especially for writes -- since writes (unlike reads) are slow regardless of the ZFS array size.

So, like, if I have a huge file to copy from a local device to the server (e.g., a phone backup), it would copy it to the Optane first and then slowly (relatively) feed it to the platter boys. I figured Optane with its high endurance would be exactly the solution, since frequent big writes could wear out a regular SSD. But I don't even know if this kind of buffering/caching system is a thing in TrueNAS (yet).
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
So, like, if I have a huge file to copy from a local device to the server (e.g., a phone backup), it would copy it to the Optane first and then slowly (relatively) feed it to the platter boys. I figured Optane with its high endurance would be exactly the solution, since frequent big writes could wear out a regular SSD. But I don't even know if this kind of buffering/caching system is a thing in TrueNAS (yet).

An SLOG is not a write cache, it's a write log. I've linked this in another thread recently, and perhaps I should convert it to a Resource, as it explains the write throttle and general concepts around the process of flushing the dirty data to disks.
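If you want to check whether your workload even generates sync writes (the only thing a SLOG accelerates), a rough sketch with a hypothetical pool named tank:

Code:
# sync=standard means only explicit sync writes (NFS, iSCSI, databases...) hit the ZIL/SLOG
zfs get sync tank
# watch per-vdev activity during a large copy; an ordinary SMB copy typically shows
# little or no traffic on the log vdev, i.e. a SLOG would not help it
zpool iostat -v tank 1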

 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
had to act fast and think later
This might be a recipe for success in some circumstances, but I still have not figured out which ones. (And I'm not only thinking about ZFS here…)

You're welcome to describe your system and your use case so we can see if there's any aspect which could benefit from Optane.
 

rigel

Dabbler
Joined
Apr 5, 2023
Messages
19
...
Minor disagreement with this; Optane is actually the ideal device for special vdevs, as it handles a mixed read/write workload extremely well compared to traditional NAND devices, and has a huge amount of endurance.

Hi, I'm considering using this 118GB Optane drive for a special VDEV, metadata only. However, I'm worried that it will be too small as my TrueNAS pool grows in the future. Is there any advice on what size of TrueNAS pool a 118GB special VDEV will be enough for?

Also, on another note: I have learned that you can store the same metadata on a separate L2ARC drive with a "metadata only" setting. Some say it is much better because it is not critical if you lose the L2ARC drive, whereas a special VDEV is another point of potential failure.

So my question is: why do we need a special VDEV if we can use a separate L2ARC drive for "metadata only"? Is there any performance benefit?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
L2ARC does not always work well, especially if you do not have enough RAM in the first place. On the other hand, if you serve a lot of files to a large group of clients via SMB and your main storage is on spinning drives, a special VDEV on SSDs will of course speed up directory search operations significantly due to their random-access nature, and so improve the performance users perceive - open a share in Explorer, and how long does it take until the directory listing shows up?

Special VDEVs should have the same level of redundancy as the main storage VDEV.
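As a rough illustration (hypothetical pool and device names), a three-way mirrored special vdev to sit alongside a RAIDZ2 data vdev would be added like this:

Code:
# three-way mirror so the special vdev can survive two failures, matching RAIDZ2
zpool add tank special mirror nvme0n1 nvme1n1 nvme2n1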
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hi, I'm considering using this 118GB Optane drive for a special VDEV, metadata only. However, I'm worried that it will be too small as my TrueNAS pool grows in the future. Is there any advice on what size of TrueNAS pool a 118GB special VDEV will be enough for?

Also, on another note: I have learned that you can store the same metadata on a separate L2ARC drive with a "metadata only" setting. Some say it is much better because it is not critical if you lose the L2ARC drive, whereas a special VDEV is another point of potential failure.

So my question is: why do we need a special VDEV if we can use a separate L2ARC drive for "metadata only"? Is there any performance benefit?
Hello @rigel and welcome to the TrueNAS community.

The size of pool metadata is highly dependent on the type of data being stored, and whether or not it can be stored as larger or smaller records. A large number of small files (eg: 4K-8K) will generate significantly more metadata than a small number of large files (multiple MB).

If you have a sample set of files, or there is data already on your pool, you can use the command below from a shell:

zdb -LbbbA -U /data/zfs/zpool.cache poolname

This command will take some time (and might generate quite a few errors on an actively used pool) but at the end will spit out a long wall of text, including a number of rows similar to the following at the end:

Code:
 5.81K     744M    8.04M      93M      16K    92.48      0.00        L5 Total
 5.81K     744M    8.04M      93M      16K    92.48      0.00        L4 Total
 7.50K     853M    11.1M     120M      16K    76.72      0.00        L3 Total
 40.5K    2.52G     577M    1.95G    49.2K     4.48      0.01        L2 Total
 2.20M     142G    41.7G     126G    57.2K     3.40      0.39        L1 Total
  542M    35.4T    21.6T    31.4T    59.3K     1.64     99.60        L0 Total
  544M    35.5T    21.7T    31.5T    59.3K     1.64    100.00    Total


As an over-simplified rule of thumb, the "L0 Total" is your actual data - everything else above that would be on your metadata vdevs. So in this example, about 0.4% of the pool size.
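Taking that 0.4% figure at face value (your own ratio will differ with your data mix), a quick back-of-the-envelope check of how much pool data a 118 GB special vdev could carry metadata for:

Code:
awk 'BEGIN { printf "%.1f TB\n", 118 / 0.004 / 1000 }'
# ~29.5 TB of pool data per 118 GB of metadata at a 0.4% ratio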

Regarding the special vdev vs. L2ARC - L2ARC is helpful but not a 100% guarantee that your metadata will be read from an SSD, and also does nothing to improve the metadata writes (if you update large numbers of small files, for example). Whereas special vdevs guarantee that the reads will be coming from SSD in a worst-case scenario, and also assist in the metadata write speed.
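For reference, the "metadata only" L2ARC behaviour mentioned above is just a property; a minimal sketch with a hypothetical pool name:

Code:
# restrict the L2ARC device to caching metadata only (the default is secondarycache=all)
zfs set secondarycache=metadata tank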
 

rigel

Dabbler
Joined
Apr 5, 2023
Messages
19
Regarding the special vdev vs. L2ARC - L2ARC is helpful but not a 100% guarantee that your metadata will be read from an SSD, and also does nothing to improve the metadata writes (if you update large numbers of small files, for example). Whereas special vdevs guarantee that the reads will be coming from SSD in a worst-case scenario, and also assist in the metadata write speed.

Thank you so much for the explanation. I guess I need to treat the special VDEV just like any other VDEV and take care of all the required redundancy.

It is still a little bit complicated in my opinion. If you add a special VDEV when you first create your pool, then all metadata goes to the special VDEV's fast SSD drives. However, if you first use your zpool without a special VDEV, all the metadata is written to the RAIDZ storage drives. Then even if you add a special VDEV later, some of the metadata will still be stored on the regular storage vdev drives. So if you want the best performance you need to add a special VDEV right when you create your zpool, is that correct? Or is there a way to migrate metadata from the regular storage drives to a high-speed special VDEV?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
If there is a special VDEV and if there is sufficient space on it, no new metadata will be written to any other VDEV. So either just wait for the special VDEV to fill or rewrite everything once.
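One way to force that one-time rewrite, sketched here with hypothetical dataset names - do this while the dataset is idle, with enough free space and a current backup:

Code:
# copy the dataset so every block (and its metadata) is freshly written,
# landing the new metadata on the special vdev
zfs snapshot tank/data@migrate
zfs send tank/data@migrate | zfs recv tank/data_new
# after verifying the copy, swap the names and drop the old copy
zfs destroy -r tank/data
zfs rename tank/data_new tank/data

A plain file-level copy into a new dataset (then deleting the originals) achieves the same effect.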
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I guess I need to treat the special VDEV just like any other VDEV and take care of all the required redundancy.
Absolutely. The loss of a special vdev will cause your pool to go offline and be unmountable.
So if you want the best performance you need to add a special VDEV right when you create your zpool, is that correct? Or is there a way to migrate metadata from the regular storage drives to a high-speed special VDEV?
Correct, @Patrick M. Hausen has explained this quite well in regards to only new/updated metadata being written to the dedicated device.

A quick point of caution though - you've mentioned that your main storage is arranged as RAIDZ, so once a metadata or other special vdev has been added to the pool, you won't be able to remove it. Be certain that you have designed sufficiently for this (available slots for mirrored devices, potential to swap one at a time or add a third if you need to replace or add redundancy)
 
Joined
Jan 1, 2023
Messages
16
Absolutely. The loss of a special vdev will cause your pool to go offline and be unmountable.

Correct, @Patrick M. Hausen has explained this quite well in regards to only new/updated metadata being written to the dedicated device.

A quick point of caution though - you've mentioned that your main storage is arranged as RAIDZ, so once a metadata or other special vdev has been added to the pool, you won't be able to remove it. Be certain that you have designed sufficiently for this (available slots for mirrored devices, potential to swap one at a time or add a third if you need to replace or add redundancy)
This is a very important point. Wendell's video (linked in this thread in post #11) doesn't mention this - but if your main pool isn't mirrored, you cannot remove a special vdev.

I also have 2x 118GB Optanes that I bought to use as a special vdev; but my current pool is a raidz2, so until I switch to a mirror in the future I'm just using one Optane as an applications pool and one on my Windows computer as a cache drive + page file + temp folder.
 