Hardware for TrueNAS

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
Hi!

I am thinking about building a NAS that is based on this hardware:
- Case Supermicro CSE-846BE1C-R1K23B
- Mainboard Supermicro X11SRM-VF
- CPU Intel Xeon-W-2245, 3.9GHz, 8-Core
- 256 GB DDR4-3200 ECC Registered (4x 64 GB modules)
- NVME M.2 Kioxia XG6 for basic system
- LSI SAS 9400-8i HBA
- Intel X550-T2 NIC
- 20x HGST Ultrastar DC HC550, 18 TB each (SAS-3), using RAID-Z2 or RAID-Z3
- Kioxia 3.2 TB PM5-V SSD (for L2ARC/ZIL/SLOG)

Does this make sense to you?
There will be big files stored on the device (about 8-32 GB each), and they will normally be written once and read multiple times. In some use cases there will be lots of small files written and read, which is why I think an L2ARC could be useful.
The system will be used by 5 users who use lots of bandwidth on a 10 Gbit/s network.

As I haven't built a TrueNAS system yet, any comments would be appreciated.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Generally it all looks fine. The boot drive will be overkill, but will do the job.

In some use cases there will be lots of small files written and read, which is why I think an L2ARC could be useful.
L2ARC won't necessarily help with all of that. It does nothing to accelerate writes, and it costs you writes on your SSD... check the maximum TBW to see how soon you might expect that drive to die.

If you're serious about SLOG, you will want an Intel SLOG device.
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
As most of the data is read-only, I think that wouldn't be a big problem.
But if it won't improve overall performance, I am open to suggestions.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Does this make sense to you?
There will be big files stored on the device (about 8-32 GB each), and they will normally be written once and read multiple times. In some use cases there will be lots of small files written and read, which is why I think an L2ARC could be useful.
The system will be used by 5 users who use lots of bandwidth on a 10 Gbit/s network.

Can you expand a bit on the workflow here? Will each of these users be working on the same large file, or will they each be working on their own? How long will they work on them, and is there a period where these files "age out" and largely move to an archival/warm-storage usage pattern?

Off the top of my head, if the large files can be separated from the small ones into a separate dataset, you can apply some tuning parameters to each separately (eg: recordsize=1M for the large files) to make things more efficient.
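
As a minimal sketch of that split (the pool and dataset names here are just placeholders, not a recommendation for your layout):

```sh
# Hypothetical pool/dataset names; recordsize=1M suits the large, mostly
# sequential files, while the small-file dataset keeps the 128K default.
zfs create -o recordsize=1M tank/largefiles
zfs create tank/smallfiles
zfs get recordsize tank/largefiles tank/smallfiles
```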

I'm not certain if SLOG will actually benefit you here. Understand that no SLOG will ever be as fast as async writes already are, since they "write into memory" in that configuration. If you need the pending writes to be safe (ie: you are working directly with DB's or VM disks on the TrueNAS system) then SLOG is required. In your case though it sounds like you'll be using SMB or similar file-level access.
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
In most cases more than one computer (aka "user") works on the same file. This will normally take one day to one week. After that, the file will in most cases not be used any more.
The smaller files (in most cases on a different CIFS share) will be used more randomly.
I hope this helps you understand the situation better.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
With that setup, I would suggest that you make two pools - one mirror from the SSDs, and one RAIDZ2 (2x 10-drive) from the HDDs. Store your large files on the SSD, and move them off (either manually or via a periodic rsync/copy job) to the HDD pool.

This will give you around 3TB of fast space for your "hot" files, and 200TB for your "cold storage" - before compression and "free space overhead" of course.

Are these video files (that get accessed in big chunks) or is it other data that could be accessed at smaller granularity? That will influence the desired recordsize, but I'm thinking you want to trend higher for big files like this.

Detailed theorycrafting:

ssdpool gets a single dataset of ssdpool/hotfiles

hddpool gets two datasets:
- smallfiles is where you can directly store the smaller files, and they live there with the default settings
- coldfiles is used as the target of the destaging from hotfiles - this dataset could be set to only allow metadata in primary ARC, in order to let your hotfiles and smallfiles datasets take that space (this is assuming that the drives can handle the speed if one of those cold files is needed)

In all cases, I lean towards "skip SLOG" as well.
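
A rough command-level sketch of that layout (pool and dataset names as above; the recordsize values and the rsync-based destaging job are assumptions of mine, not settled recommendations):

```sh
# Hot working set on the SSD mirror, tuned for large, mostly sequential files
zfs create -o recordsize=1M ssdpool/hotfiles

# Small files keep the defaults; the cold archive only caches metadata in ARC
zfs create hddpool/smallfiles
zfs create -o recordsize=1M -o primarycache=metadata hddpool/coldfiles

# Example destaging job, run manually or from a periodic cron task
# (the paths are illustrative)
rsync -a --remove-source-files /mnt/ssdpool/hotfiles/done/ /mnt/hddpool/coldfiles/
```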
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
The big files are compressed disk images; 3 TB doesn't seem big enough for the hot files.
My idea was to use the SSD as a cache for lots of small files (in addition to the RAM cache), but I don't know if this really makes sense.

Another question: should I use 4K or 512b disks?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Compressed disk images being read by multiple users ... "digital forensics" or "data recovery" is my guess.

The reason I'm leaning towards the use of an all-flash pool is that it's a way to guarantee (albeit with a little bit of manual curation required) the performance of these images to always be "SSD fast" at a minimum, with them being "RAM fast" if they're in the ARC. If you're using L2ARC, you have to wait for it to warm up, and you can't guarantee that everything that gets written in will immediately wind up there.

L2ARC is good, but it's not a true extension of ARC in that it has no concepts of "most frequently" or "most recently" used - it's just a ring buffer. The last things to fall off the tail end of the ARC buffers get loaded in there, and if L2ARC is getting full it just starts pushing whatever the oldest content is out. First in, first out.
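
If you do end up with an L2ARC, it's worth checking whether it actually warms up and serves hits before relying on it; a hedged sketch (the pool name is an example, and tool availability and output fields vary by TrueNAS version):

```sh
# Watch allocation and I/O on the cache vdev as the L2ARC fills
zpool iostat -v hddpool 5

# L2ARC hit/miss counters, if the arc_summary script is present on your build
arc_summary | grep -i -A 12 "l2arc"
```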

How many images will be worked on at a given time? 3T of space is a fairly sizeable amount relative to 8-32GB files, and you can always shuffle data to/from the HDD-based pool based on utilization. But I do understand if you want the "fully transparent" solution of L2ARC as well.

Another question: should I use 4K or 512b disks?

Either is fine as long as it's not a shingled/SMR drive. The defaults in TrueNAS will write a minimum of 4K at a time (using the ashift=12 defaults) so you might get some small efficiency gains with a 4Kn device, but it's not paramount.
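
If you want to double-check what you end up with, here is a hedged way to inspect both sides (the pool and device names are examples):

```sh
# Pool side: the ashift recorded for each vdev (expect 12 for 4K alignment)
zdb -C hddpool | grep ashift

# Drive side: the logical/physical sector sizes the disk reports
smartctl -i /dev/da0 | grep -i "sector size"
```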
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
"digital forensics" or "data recovery" is my guess.
Right guess.

How many images will be worked on at a given time?
Hard to say. In most cases 2 or 3 images, split into chunks of 8-32 GB. They are used kind of randomly (user 1 works with image x, user 2 works with image y, there are things to check on image z). A few weeks ago I had a single image that was 8 TB on its own.

3T of space is a fairly sizeable amount relative to 8-32GB files, and you can always shuffle data to/from the HDD-based pool based on utilization. But I do understand if you want the "fully transparent" solution of L2ARC as well.
The major problem there is that the users have no write access to the images, for data integrity reasons. So in your proposal there would have to be an admin who moves files from a hidden area to the work area, and that breaks the "quite random" workflow completely.

I really don't expect speed miracles, but I would like to get the best speed possible in a reasonable setup. I also thought about using a really big SSD as a read cache (there are devices with around 13 TB out there), but I am not sure that this would improve the situation.

Mostly, data is read, the relevant parts get extracted and written to a different share (not necessarily a different physical location), and at the moment I don't really know how best to solve this.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I'm going to have to mull this one over a bit and get back to you. Obviously a multi-TB set of hot data is going to be tough to reliably cache. A bigger set of SSDs will obviously hold more but you may still end up fighting the L2ARC fill algorithm for good results.

Regarding the requirements for read-only access, will there be requirements (either soft internal or hard legal) for auditing and tracing? You can obviously address some of the issues with access controls to the share.
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
The access control thing is not an issue. We have this running on the system we use now.
If you know that a lot of data is used in a "WORM" style, this may influence the NAS design. And right now I really don't know whether additional caching will speed up the whole thing.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
If I understood things correctly, the read access will be largely sequential for the big files. So for those, "classic" hard disks (instead of SSDs) would not be that much of a problem.
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
If I understood things correctly, the read access will be largely sequential for the big files. So for those, "classic" hard disks (instead of SSDs) would not be that much of a problem.
I guess this is correct in most cases. When restoring data, it is read from the images in a more random way, I think. But this is not a big problem.

Right now there are two main questions:
  1. Will a properly configured L2ARC/ZIL/SLOG likely improve the overall performance?
    If yes: what size would be best?
  2. Are there any problems when choosing an AMD CPU instead of an Intel one?
 

Herr_Merlin

Patron
Joined
Oct 25, 2019
Messages
200
Go AMD only if you go for Epyc. There is a topic somewhere on this forum showing that Ryzen / Threadripper supports ECC but does not report errors back to the OS properly, or at all.
If you want a huge ARC, why not go with Epyc and 2 TB of memory?
On the other hand, since you said you will most likely deploy a huge L2ARC, why not go with a dual Intel Xeon system, 512 GB of RAM, and 2 TB of persistent memory for a persistent L2ARC...
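
One note on that last idea: OpenZFS 2.0 (as shipped with TrueNAS 12) can rebuild the L2ARC contents after a reboot, controlled by the l2arc_rebuild_enabled module parameter; a hedged way to check it (the exact sysctl/path names are my assumption and may differ by platform and version):

```sh
# TrueNAS CORE (FreeBSD): sysctl name assumed from the OpenZFS parameter name
sysctl vfs.zfs.l2arc.rebuild_enabled

# Linux-based OpenZFS: module parameter exposed under /sys
cat /sys/module/zfs/parameters/l2arc_rebuild_enabled
```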
 

SirHenry

Cadet
Joined
Dec 4, 2020
Messages
8
Go AMD only if you go for Epyc. There is a topic somewhere on this forum showing that Ryzen / Threadripper supports ECC but does not report errors back to the OS properly, or at all.
If you want a huge ARC, why not go with Epyc and 2 TB of memory?
On the other hand, since you said you will most likely deploy a huge L2ARC, why not go with a dual Intel Xeon system, 512 GB of RAM, and 2 TB of persistent memory for a persistent L2ARC...
That is the question. I really have no clue what RAM size would be good and whether L2ARC makes sense.
 