DDR5 “ECC” and ZFS

LIGISTX · Jan 4, 2022

I am planning on a homelab upgrade once DDR5 prices settle into something more normal (scalping really does blow…), but was curious on what the merits of DDR5’s internal checksumming was vs what I believe is called the current standard of full “chip to RAM and back” checksumming verification.

From my understanding, the scrub of death is more or less debunked, although with that said ECC is obviously always preferred. DDR5 introduced on memory error correction, but by default in a consumer board (which I was planning to use, possibly with a i5 12600 for example, didn’t want to start introducing P and E cores into ESXi or really any server OS yet as I doubt they will know how to handle that well…..) this is not exactly akin to current ECC implementation.

Would anyone recommend heavily against this and try to instead find a used system with tried and true DDR4 ECC? With the used market in its current state, I have not had much luck finding any decently priced Xeon/mobo/RAM packages on eBay. I don’t need much, current homelab is a i3 and 28GB of ECC, but I’m really starting to run up against a lack of RAM on this system but have been holding out until DDR5. I have Ubuntu server VM’s choked down to 1.2GB of RAM just trying to not over allocate, thankfully that hasn’t posed an issue…. Yet.

jgreco · Jan 4, 2022

LIGISTX said:
the scrub of death is more or less debunked

Be careful where you get your information. There's a very real problem in that if a ZFS pool becomes corrupted, there really may be no way to repair the pool. There is no chkdsk, no fsck. And many people have ZFS pools over 100TB, which isn't exactly easy to back up to a temporary location. We do know that there are potential vectors for non-ECC to cause corruption. If you want to argue that it's unlikely, that's fair. I've driven half a million miles in the last decade and haven't gotten into an accident. The seat belt I wear isn't there to protect me against those non-problematic miles. In that same way, yes, you may be able to get away with non-ECC memory in many or most cases, but that does not translate to an actual debunking of the additional safety.

LIGISTX said:
instead find a used system with tried and true DDR4 ECC?

I think the argument here isn't the "DDR4" bit, but, rather, the "used system". NAS does not require the latest and greatest CPU/mainboard/fastest memory. You're usually better off finding older gear that is no longer commanding those "fastest/bestest" hardware price premiums.

Quite frankly, we're still using DDR3 based systems in many roles. The development cluster here has more than a terabyte of DDR3 available. For a NAS, you might want to look at lower speed DDR4 options as they get flushed out of data center roles. If I had a choice between 64GB of the very best, fastest, latest DDR5, and 128GB of middlin'-speed DDR4, I would very often choose the DDR4 unless I truly had an application that required the fastest RAM.

LIGISTX · Jan 4, 2022

jgreco said:
There's a very real problem in that if a ZFS pool becomes corrupted, there really may be no way to repair the pool. There is no chkdsk, no fsck. And many people have ZFS pools over 100TB, which isn't exactly easy to back up to a temporary location. We do know that there are potential vectors for non-ECC to cause corruption.

Thats fair... There are certainly ways of filesystems getting corrupted, but I had thought the idea of a flipped bit in RAM causing a SCRUB that would nuke your data was more or less debunked. I am sure there are many other ways of causing corruption, though.

jgreco said:
If you want to argue that it's unlikely, that's fair.

I guess what I am trying to determine is just how likely it actually is, and how much the ECC implemented in consumer DDR5 negates it. From my understanding, there is still the potential for a bit flip when being written or read by the CPU and while in transit, but while the data sits in RAM it will be protected by the on die ECC of DDR5, right? So this sort of partly mitigates the potential? Or is most of the issue actually when being written/read/in flight?

jgreco said:
I think the argument here isn't the "DDR4" bit, but, rather, the "used system". NAS does not require the latest and greatest CPU/mainboard/fastest memory. You're usually better off finding older gear that is no longer commanding those "fastest/bestest" hardware price premiums.

My NAS is a full up homelab, by full up I just mean I have truenas virtualized under ESXi, along with many other guests. I am really the only user, and I do not do much on it... an older socket 2011/DDR3 system would likely suffice, but the power draw and heat output are things I am looking to avoid seeing as this system lives in my closet. I likely could get a DDR3 based system for a decent price, but I don't think I would want to go that old for the reasons above.

jgreco said:
For a NAS, you might want to look at lower speed DDR4 options as they get flushed out of data center roles. If I had a choice between 64GB of the very best, fastest, latest DDR5, and 128GB of middlin'-speed DDR4, I would very often choose the DDR4

100% agree, I don't really care about the speed of the RAM, I am frankly running single channel (I think, seeing as I have miss-matched RAM sizes....) DDR4 2133 without any issue, on a piddly little i3 with only 4 threads. I mostly just need more RAM (and a few more CPU cores), but I am struggling to find anything that really fits the bill for anywhere near affordable. Ideally 8-12 threads would be plenty enough seeing as my current 4 threads is "fine", but I would like to get to at least 64GB of RAM...

I guess at the end of the day, I have no need to go DDR5 besides its built in ECC which I was more or less banking on to circumvent the need to get actual ECC RAM. I have no issue staying on a old gen DDR4 platform, I am just struggling to figure out what makes sense to buy. I run an ATX full tower case, and my current HPE Proliant ML10 Gen 9 motherboard doesn't have much in the way of upgrade path.... I have found it very difficult to find ECC RAM that will work in it seeing as it won't use registered DIMMs. It can support 64 GB of ram which honestly would be enough, but I would need to dump all of my current RAM as its current 3x8GB and 1x4 GB, and getting 4x16GB of RAM that works on the mobo looks to be exceedingly expensive.

Do you have any recommendations?

jgreco · Jan 4, 2022

LIGISTX said:
I had thought the idea of a flipped bit in RAM causing a SCRUB that would nuke your data was more or less debunked

Well, you did specify "less", so, I can agree, it's "less" debunked. How likely is it for a single bit flip to cause damage? It's really hard to say. This isn't just a matter of "scrub of death", which is sort of bull$#!+, but it's used to explain a complex topic to minds unused to the concept. So let's say you have a filer that is up for a year at a shot. You have some bits of high value metadata that get read into the ARC in January, and because the contents are not CHANGED frequently, but are ACCESSED frequently, the ARC-cached block remains in core until November. However, at some point in July, a cosmic ray hits the RAM and causes a bit flip. I think we can all agree that if this happens to hit a directory block, and changes a filename from "bat" to "cat", this isn't a catastrophic change. However, ZFS has no way to detect that bit flip, so when you add another file to that directory in November, causing the ARC-cached block to be written back to disk, you're now permanently stuck with "cat". If we define this as "harmless", then we can agree that bit flips are harmless. Except that it could be OTHER metadata, more critical metadata, such as the list of blocks that make up a directory, that gets corrupted. So, basically, this is a point many people have argued over, pointlessly, for a long time. It's a boring argument and I can't be arsed to care about so-called debunkings. Compsci is *science* and reality wins over opinion.

LIGISTX said:
just how likely it actually is,

One of the advantages to ECC is that it lets you know just how likely memory errors are. Answer: They're rare, and a true random bit flip is highly unusual. It's much more common when a memory module is getting near the point of failure, or some schmuck has used the wrong memory, etc.

LIGISTX said:
much the ECC implemented in consumer DDR5 negates it.

My feeling is that the largest risk is in long-term data stored in ARC. That's opinion, though. My opinions are typically pretty good because I derive them from facts where possible, but, still, opinion.

LIGISTX said:
Or is most of the issue actually when being written/read/in flight?

Well, we can look to non-memory subsystems for instructive education. If you fail to properly cool an LSI HBA, for example, and it starts to overheat, one of the repeating themes in the forums are that data is corruptible as it passes through an HBA, and, since ZFS does not verify the data it has written, a pool can be killed by an HBA spamming garbage (which is really just large quantities of flipped bits) at it. So you really do want end-to-end data integrity. We hopefully agree that this does not consistently happen in modern computing systems, and this is a tragedy.

LIGISTX said:
power draw and heat output are things I am looking to avoid seeing as this system lives in my closet. I likely could get a DDR3 based system for a decent price, but I don't think I would want to go that old for the reasons above.

Once you get into the Sandy/Ivy Bridge systems, the watt burn substantially improves. After that, it has been a gradual evolution over the last ten years. It was really the Nehalem and Westmere that were nightmare space heaters. However, of course, comparing a contemporary system to a ten-year old Sandy system will still have the modern system being a fair bit more efficient.

LIGISTX said:
Do you have any recommendations?

Nothing in particular. For operations here, I'm kinda riding out our legacy investment in Sandy/Ivy systems and DDR3, waiting for DDR5 to come around, and for that to make DDR4 to become cheaper on the used market. This probably gives you some idea as to a possible strategy if you can afford to wait.

Ericloewe · Jan 4, 2022

LIGISTX said:
how much the ECC implemented in consumer DDR5 negates it

Anywhere between "not at all" and "a tiny bit". It's not there as a feature, it's there to get yields up to an acceptable margin.

LIGISTX · Jan 4, 2022

jgreco said:
My feeling is that the largest risk is in long-term data stored in ARC. That's opinion, though. My opinions are typically pretty good because I derive them from facts where possible, but, still, opinion.

Ericloewe said:
Anywhere between "not at all" and "a tiny bit". It's not there as a feature, it's there to get yields up to an acceptable margin.

Based on jgreco's opinion of the largest risk being in long term ARC.... wouldn't the DDR5's on die ECC "negate" this risk? Seeing as the data being held within the RAM will be error corrected at all times. So a random bit flip from a solar ray hitting a RAM module should be corrected by the RAM itself, thus ARC is preserved (not in flight... but in storage over months). I will admit a solar ray hitting a trace as data is in flight would cause an uncorrectable error, but..... could that not just as easily happen over the SATA cable, through the CPU itself, through the HBA, etc? Maybe some of those examples are checked and corrected for by things I do not fully understand, but it sounds like the long term storage of ARC entries would be protected.

Please let me know if I am incorrect, as I am starting to think through upgrade paths here, and if consumer DDR5 isn't it, I need to re-adjust my expectations. Again, this is "just" a homelab, but I still plan on doing things smartly.

jgreco · Jan 4, 2022

@Ericloewe was, I believe, trying to make the point that the purpose of this consumer DDR5 "ECC" is necessary in order to get sufficient usability out of the memory; this would imply it has higher error rates due to various factors. It would be protecting data at rest, not in transit on the memory bus. This is certainly an improvement over the status quo, but is it enough?

There are many places in the PC architecture where the integrity of data can be called into question.

My specific concern with ARC would be that it has a potential for holding onto a block of data for a long time, and then rewriting the checksum as it is rewritten to disk. This means that a corrupted block is issued a valid checksum, making the corruption undetectable.

When sectors of data are flying through the SATA cable, CPU, HBA, etc., these sectors are assembled into a block and the checksum is verified. So while data corruption can be introduced via any of these components, the corruption should be detected because the checksum won't match.

It is due to the role of the ARC, and the fact that its contents have already been checksum validated, where there seems to be the most room for trouble. You read the data from disk, put it in memory, checksum validate it, and then make this a block of ARC data. This is all safe. It's what happens next that is concerning ... if the data is corrupted in memory, the ARC still trusts it, and, worse, if the block is updated, actually writes the damaged block back out to disk with a valid checksum.

I think if you look at that description, it should be clear that the memory itself does need to be trustworthy to store data over a long period, but to be truly resistant to corruption, you need to protect the entire data path between when a checksum is validated and a block placed into the ARC, all the way to when a block's new checksum is calculated. This covers the RAM, the memory bus, the cache, and the CPU, from what I can tell.

LIGISTX · Jan 4, 2022

jgreco said:
@Ericloewe was, I believe, trying to make the point that the purpose of this consumer DDR5 "ECC" is necessary in order to get sufficient usability out of the memory; this would imply it has higher error rates due to various factors. It would be protecting data at rest, not in transit on the memory bus. This is certainly an improvement over the status quo, but is it enough?

There are many places in the PC architecture where the integrity of data can be called into question.

My specific concern with ARC would be that it has a potential for holding onto a block of data for a long time, and then rewriting the checksum as it is rewritten to disk. This means that a corrupted block is issued a valid checksum, making the corruption undetectable.

When sectors of data are flying through the SATA cable, CPU, HBA, etc., these sectors are assembled into a block and the checksum is verified. So while data corruption can be introduced via any of these components, the corruption should be detected because the checksum won't match.

It is due to the role of the ARC, and the fact that its contents have already been checksum validated, where there seems to be the most room for trouble. You read the data from disk, put it in memory, checksum validate it, and then make this a block of ARC data. This is all safe. It's what happens next that is concerning ... if the data is corrupted in memory, the ARC still trusts it, and, worse, if the block is updated, actually writes the damaged block back out to disk with a valid checksum.

I think if you look at that description, it should be clear that the memory itself does need to be trustworthy to store data over a long period, but to be truly resistant to corruption, you need to protect the entire data path between when a checksum is validated and a block placed into the ARC, all the way to when a block's new checksum is calculated. This covers the RAM, the memory bus, the cache, and the CPU, from what I can tell.

Guess I need to really start looking for some used early DDR4 type systems... I really am not coming across much though :/

jgreco · Jan 5, 2022

I think it's going to be maybe two years before we start seeing some attractive deals.

Wolverine2349 · Oct 17, 2023

I am new to TrueNAS for DIY NAS and have researched myself and have found different answers.

Some say the scrub of detah is debunked and here you jgreco say iy is not ZFS has no chkdsk, no fsck.

The strange thing is I have read elsewhere that ZFS actually protects more against failures than others. But if chkdsk, no fsck, how could that be. And supposedly ZFS is very reliable yet why wood it not have those things. Is it because it is copy on write.

If not using ECC RAM, is it better off to use a NAS distro with EXT4 like OPenMediaVault and does EXT4 have something like chkdsk or fsck.

Or how about BTRFS??

This is all SATA SSD build over 10Gbe, so are speed advantages of memory caching of ZFS and BTRFS negated by the much faster SSD speeds than HDD spinners?

Ericloewe · Oct 17, 2023

Wolverine2349 said:
The strange thing is I have read elsewhere that ZFS actually protects more against failures than others. But if chkdsk, no fsck, how could that be. And supposedly ZFS is very reliable yet why wood it not have those things. Is it because it is copy on write.

It's not a simple discussion, but I'll try to boil it down to the following: ZFS operates such that it atomically transitions between fully-valid states. The likes of chkdsk and fsck mostly deal with dangling references, incomplete operations, that sort of thing. I would argue that there is nothing either of those could fix that ZFS isn't already making sure is correct at every single step.

Wolverine2349 said:
If not using ECC RAM, is it better off to use a NAS distro with EXT4 like OPenMediaVault and does EXT4 have something like chkdsk or fsck.

You are never better off using the likes of ext4. But it is weird to focus so much on ECC, if you care about your data, always use ECC.

Wolverine2349 said:
Or how about BTRFS??

The filesystem whose wiki includes pearls such as:

The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6. There are some implementation and design deficiencies that make it unreliable for some corner cases and the feature should not be used in production, only for evaluation or testing. The power failure safety for metadata with RAID56 is not 100%.

Yeah, that's not okay on so many levels.

Whattteva · Oct 17, 2023

LIGISTX said:
Would anyone recommend heavily against this and try to instead find a used system with tried and true DDR4 ECC? With the used market in its current state, I have not had much luck finding any decently priced Xeon/mobo/RAM packages on eBay. I don’t need much, current homelab is a i3 and 28GB of ECC, but I’m really starting to run up against a lack of RAM on this system but have been holding out until DDR5. I have Ubuntu server VM’s choked down to 1.2GB of RAM just trying to not over allocate, thankfully that hasn’t posed an issue…. Yet.

I would recommend heavily to go with registered (NOT unbuffered) DDR4 ECC, ESPECIALLY since your 2 principal complaints are:

You're running out of RAM.
DDR5 ECC is robbery.

Second-hand registered DDR4 ECC sticks are cheap and plentiful. You can get a used 64 GB stick for just $50 on ebay. I got myself 224 GB (first system in signature) of it back when it cost $80-90/stick (December 2022) and I thought that was still cheap!

You might have to shop around a bit to get a good Mobo/CPU combo though... unless you go with those Dell/HP racks. Supermicro second-hand supply is rather low. I had to look for at least a month to get mine.

Also, it is nice to just be able to allocate 16-32 GB of RAM to my VM's without thinking too much about it.

Davvo · Oct 17, 2023

@Wolverine2349 quoting the ZFS Introduction by @Ericloewe:

To this end, ZFS is completely Copy-on-Write (CoW) and checksums all data and metadata.
Checksums are kept separate from their blocks, so ZFS can verify that data is valid and that is it the correct data and not something else that your evil disks or storage controller sent your way.
When ZFS detects an error, it immediately corrects it if possible. [...]
Repair tools, such as fsck, do not exist for ZFS as it is supposed to always be consistent.

Introduction to ZFS

This is a short introduction to ZFS. It is really only intended to convey the bare minimum knowledge needed to start diving into ZFS and is in no way meant to cut Michael W. Lucas' and Allan Jude's book income. It is a bit of a spiritual...

www.truenas.com

Wolverine2349 · Oct 17, 2023

You are never better off using the likes of ext4. But it is weird to focus so much on ECC, if you care about your data, always use ECC.

Just on a NAS or even on a gaming PC?

Ericloewe · Oct 17, 2023

I run ZFS on everything that's not Windows (stares at ZFS on Windows), so my answer to that is "yes, as much as possible".

Wolverine2349 · Oct 17, 2023

Ericloewe said:
I run ZFS on everything that's not Windows (stares at ZFS on Windows), so my answer to that is "yes, as much as possible".

So ZFS on everything not Windows and using ECC RAM for it.

On your Windows boxes with NTFS, do you use ECC RAM or no?

danb35 · Oct 17, 2023

Whattteva said:
Supermicro second-hand supply is rather low. I had to look for at least a month to get mine.

My Supermicro X11DPH-T seems to be reasonably available, with a pair of CPUs (and even a little bit of RAM, though not really enough to be useful) in the range of $400-500. Not super cheap, but reasonable, I thought. And I agree that RAM is quite reasonably priced.

Whattteva · Oct 17, 2023

danb35 said:
My Supermicro X11DPH-T seems to be reasonably available, with a pair of CPUs (and even a little bit of RAM, though not really enough to be useful) in the range of $400-500. Not super cheap, but reasonable, I thought. And I agree that RAM is quite reasonably priced.

I suppose it depends on timing and also the model. I was looking around the time just after that whole crypto crash when scalpers just finally stopped hoarding GPU's. I think it was also around the time when electronics supply in general were finally recovering after that whole pandemic global shutdown, so that could be why.

Whattteva · Oct 17, 2023

Ericloewe said:
I run ZFS on everything that's not Windows (stares at ZFS on Windows), so my answer to that is "yes, as much as possible".

Hmm... I run UFS on my FreeBSD VM's (Ext4 on Linux VM's) simply because the hypervisor already runs ZFS and I think it's just kind of redundant and avoids extra overhead of ZFS.

That being said, I do concede that it is easier to generate thin jail templates with ZFS snapshots, but things like Bastille makes that easy even with UFS too.

jgreco · Oct 17, 2023

Wolverine2349 said:
The strange thing is I have read elsewhere that ZFS actually protects more against failures than others. But if chkdsk, no fsck, how could that be. And supposedly ZFS is very reliable yet why wood it not have those things. Is it because it is copy on write.

ZFS does not have things like chkdsk or fsck because it is impractical to do these operations on a tiny computer, even if you think your 128 core 1TB machine is "huge". ZFS filesystems often range into the multipetabyte range. Consider this:

Let's pretend that the hypothetical 128C/1TB system I just speculated about has 4PB of storage. If we were to break that down into 4C/32GB systems, we would end up with 32 separate systems. (4C x 32 = 128C, 32GB x 32 = 1TB). Now if we split the 4PB up into 32 chunks, that'd be 128TB per system. Is 128TB even practical to fsck or chkdsk on a 32GB 4 core system? And if not (it's not), how is ZFS supposed to manage that trick for something 32x larger? You need a HUGE amount of memory and resources and it'd also take... forever. This only gets worse because ZFS has incredible CoW features such as snapshots and clones that substantially increases the metadata complexity, which means the system has to work much harder to validate the correctness of the structure.

ZFS works super hard to protect against introducing failures into the pool in the first place because once an error is introduced, it may be virtually impossible to expunge, especially if it is in something like metadata. This is why ZFS users need to be a little paranoid about things like using well-supported HBA's, and favoring ECC memory, because once crap data is flushed out to the pool, it could potentially do permanent damage to the pool, with the primary recovery option being "use your backups". On the flip side, if you play the ZFS game the way the designers intended, your data is quite safe compared to a conventional filesystem.

Important Announcement for the TrueNAS Community.

DDR5 “ECC” and ZFS

Guru

Resident Grinch

Guru

Resident Grinch

Server Wrangler

Guru

Resident Grinch

Guru

Resident Grinch

Cadet

Server Wrangler

Wizard

MVP

Cadet

Server Wrangler

Cadet

Hall of Famer

Wizard

Wizard

Resident Grinch

Similar threads