SSD choice for TrueNAS SCALE boot pool

fornex

Dabbler
Joined
Jul 13, 2023
Messages
11
Hello everyone,

I have built my first TrueNAS system, which I have been enjoying for the past 5 months or so. However, the SSDs that I use for my boot pool have been driving me nuts with occasional read/write errors, which render the pool DEGRADED. After trying out many things (see more details below), I decided to switch to other SSDs. However, I am a bit lost as to which SSD to choose so that I do not run into these exact same errors and problems.

In your opinion, what is the best SSD for a boot pool that is not awfully expensive? This is essentially a home NAS with some more important work-related documents stored on it as well. I have seen some recommendations for the Intel D3-S4520, and the Samsung 860 and 870, with some counterarguments for the latter two as well.

A little more detail:

I use 2 Kingston A400 250GB SSDs in a mirror for the boot pool of my TrueNAS SCALE installation. The first write error happened roughly 1 month after installation, and errors then recurred every single time after the scheduled SMART tests ran on Sunday (most of them happened when the catalog sync took place). I changed SATA ports, but the errors kept happening. I then replaced the data cables, with the same result, and finally the power cables. However, errors still happen to this day. One of the SSDs failed spectacularly with 1000+ write errors and a ton of read errors, which totally bricked it (the other SSD in the same mirror reported 1 TB of data written at this point in Disk Drill). I was dumb enough to replace it with a new A400 once again. For a while, the system ran perfectly, passing short and long SMART tests, but about 1 month later, the errors showed up again. The funny thing is that about 90% of the time the SMART tests pass, especially if I start them by hand, but they usually fail during the scheduled tests when the NAS has been running for a while. Essentially, the same things happen as described in this post (errors -> DEGRADED status -> zpool clear -> scrub -> manual tests pass -> scheduled SMART tests fail -> errors -> boot pool is DEGRADED -> repeat). I have seen in this post that the Kingston A400 is not the best choice for a boot pool. Therefore, I would like to ask: in your opinion, what is the best choice for the SSDs?
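To make the "errors -> DEGRADED" step of that cycle easier to track over time, here is a minimal Python sketch that scans the text of `zpool status` for devices that are not ONLINE or that show non-zero READ/WRITE/CKSUM counters. The sample text is hypothetical and written by hand to mimic the usual table layout; it is not output from my system, and the device names `sda3`/`sdb3` are placeholders.

```python
import re

# Matches a device row in the "config:" table of `zpool status`:
# NAME STATE READ WRITE CKSUM. Counters may be abbreviated (e.g. "1.02K"),
# so they are kept as strings instead of being converted to integers.
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+"
    r"(?P<read>[\d.]+[KMGTP]?)\s+"
    r"(?P<write>[\d.]+[KMGTP]?)\s+"
    r"(?P<cksum>[\d.]+[KMGTP]?)"
)


def devices_with_errors(status_text):
    """Return table rows that are not ONLINE or have a non-zero counter."""
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header lines, status text, blank lines
        row = match.groupdict()
        counters = (row["read"], row["write"], row["cksum"])
        if row["state"] != "ONLINE" or any(c != "0" for c in counters):
            flagged.append(row)
    return flagged


# Hypothetical sample, hand-written for illustration only.
SAMPLE = """\
  pool: boot-pool
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   DEGRADED     0     0     0
      mirror-0  DEGRADED     0     0     0
        sda3    ONLINE       0     0     0
        sdb3    FAULTED      3 1.02K     0  too many errors

errors: No known data errors
"""

for row in devices_with_errors(SAMPLE):
    print(row["name"], row["state"], row["read"], row["write"], row["cksum"])
```

On a live system the text could come from something like `subprocess.run(["zpool", "status", "boot-pool"], capture_output=True, text=True).stdout`, assuming the boot pool is named `boot-pool`. Logging the flagged rows after each scheduled SMART test would show whether the errors really line up with the test schedule.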

Another detail:

I moved the System Dataset to the SSD to reduce the noise coming from the NAS in the first month or so. Could this negatively affect the life of the SSD?
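One rough way to reason about this is to compare the data written so far with the endurance rating (TBW) from the drive's datasheet. The sketch below uses the 1 TB written figure mentioned above; the 80 TBW rating is a placeholder assumption for illustration, not a number from this thread, so substitute the value from the datasheet of the actual drive.

```python
def wear_fraction(tb_written, rated_tbw):
    """Fraction of the rated write endurance consumed so far."""
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    return tb_written / rated_tbw


TB_WRITTEN = 1.0          # figure reported by Disk Drill, from the post above
ASSUMED_RATED_TBW = 80.0  # hypothetical rating; check the drive's datasheet

fraction = wear_fraction(TB_WRITTEN, ASSUMED_RATED_TBW)
print(f"Endurance used: {fraction:.2%}")
```

If the real rating is anywhere near the assumed one, the extra writes from the System Dataset would shorten the drive's life somewhat, but they would be unlikely to explain failures after only a month or so.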

Thank you in advance!
 

cjboyle

Cadet
Joined
Sep 5, 2023
Messages
1
I just wanted to note that I've been having a similar experience.

The boot pool on my system has been running off a single Kingston A400 120GB SSD for nearly two years, and smartctl tests don't seem to run at all on this disk ("Read SMART Data failed: scsi error badly formed scsi parameters"). However, I haven't had any issues whatsoever with the boot pool, so I'm happy to ignore it.

On the other hand, I recently added 2x Kingston A400 240GB drives, primarily for the Kubernetes apps and PVC storage, and both degraded and then faulted after 25-28 days. I plugged them into a Windows system to check them with Kingston's KSM software, which reported no errors. I then reattached them to the NAS and cleared the SMART counters, but the drives degraded again after 3 days. Interestingly, I was also able to reattach/import the zpool directly, so I'm not sure if there was *actually* any data corruption...

Currently starting the RMA process, though in the meantime I've ordered two WD Blue 1TB SSDs, so we'll see if they fare any better...
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
WD Blue drives are budget SSDs, so not ideal. Personally, the Transcend SSD I use as a boot drive has been fine. Samsung is always solid and Kingston usually is, but I think that since the A400 is also a budget drive, it does not like being in a RAID / pool configuration.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

fornex

Dabbler
Joined
Jul 13, 2023
Messages
11
Thank you everyone for your answers!

@cjboyle
I had the exact same results: the SSDs did not report any errors in Windows, and even their health check showed perfectly fine results. Yet, after plugging them back into TrueNAS, I would get errors in at most a week.

Since my original post, I replaced my Kingston A400 SSDs with Samsung 870 EVOs (500GB). So far, there have been no issues with the new ones.
 