Multiple pools with dataset per pool vs one pool with multiple datasets

ilmarmors

Dabbler
Joined
Dec 27, 2014
Messages
25
Situation: There is a need for:
1. one dataset with an almost write-once-read-many use case; data will change very rarely.
2. one dataset used as a kind of temporary file storage; data will be written, processed, and deleted very often.

As I understand from ZFS guides, each dataset is an independent ZFS filesystem, but it is not clear to me how pool fragmentation is affected when there are datasets with different usage patterns and different fragmentation behaviour.

Question: Is it better to have one pool with two datasets, or two pools with one dataset per pool, in the case above?

If fragmentation of the second dataset doesn't affect fragmentation of the first dataset, then it would be easier and more flexible to put both datasets on the same pool. The goal is to be able to fill up the space as much as possible (above the recommended 80% limit) without a critical performance drop or pool crash, which less fragmentation should help with, if I understand correctly.

The data is not critical (it is not the primary copy), and performance shouldn't be an issue - uploads are only in the 20-100 MB/s range at most, but over a long time. I see two options for how I could use 36 x 14TB disks to maximize space efficiency (a rough capacity comparison follows below the list):
1. one pool with 3 x 12-disk RAIDZ2 vdevs
2. one pool with 3 x 11-disk RAIDZ2 vdevs (for the first dataset) + one pool with 2 HDDs in a mirror (for the second dataset) + 1 HDD as a spare disk (for the first pool).
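
As a rough sanity check on usable space, here is a back-of-the-envelope sketch in Python; it ignores ZFS metadata, allocation overhead and the TB/TiB difference, and assumes exactly 14 TB per disk:

```python
# Hypothetical back-of-the-envelope capacity comparison for the two layouts.
DISK_TB = 14

# Option 1: one pool, 3 x 12-wide RAIDZ2 vdevs (2 parity disks per vdev)
opt1_data_disks = 3 * (12 - 2)            # 30 data disks
opt1_usable = opt1_data_disks * DISK_TB   # ~420 TB of data capacity

# Option 2: 3 x 11-wide RAIDZ2 vdevs + a separate 2-disk mirror pool + 1 spare
opt2_raidz_data = 3 * (11 - 2)            # 27 data disks
opt2_mirror_data = 1                      # a mirror stores one copy of the data
opt2_usable = (opt2_raidz_data + opt2_mirror_data) * DISK_TB  # ~392 TB

print(f"Option 1: ~{opt1_usable} TB usable")
print(f"Option 2: ~{opt2_usable} TB usable (plus 1 idle spare)")
```

So option 2 costs roughly two disks' worth of capacity (one extra parity set plus the spare) compared to option 1.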

An additional question regarding ZFS record size. I have 512e HDDs with a 4K physical sector size. FreeNAS shows the default ashift as 12. The default 128KB ZFS record size would be 32 physical sectors. Do I understand correctly from https://www.delphix.com/blog/delphi...or-how-i-learned-stop-worrying-and-love-raidz and https://docs.google.com/spreadsheet...jHv6CGVElrPqTA0w_ZY/edit?pli=1#gid=1576424058 that it is better to have a bigger ZFS record size - the parity cost for a 12-disk Z2 at a record size of 32 sectors (128KB) is 24%, but at 64 sectors (256KB) just 18%; similarly, the parity cost for an 11-disk Z2 at 32 sectors is 24%, but at 64 sectors (256KB) just 21%, and at 128 sectors (512KB) just 19%? Or am I reading that spreadsheet wrong?
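
To check my reading, I tried to reproduce the spreadsheet figures with a small script. This is only my assumption of the allocation model from the Delphix post: data sectors are striped across (width - parity) disks, each stripe adds `parity` parity sectors, and the whole allocation is rounded up to a multiple of (parity + 1) sectors:

```python
import math

def raidz_cost(width, parity, recordsize, sector=4096):
    """Approximate parity + padding overhead for one record on a RAIDZ vdev,
    following the allocation model I understood from the Delphix blog post."""
    data = math.ceil(recordsize / sector)                    # data sectors per record
    stripes = math.ceil(data / (width - parity))             # stripes needed
    total = data + stripes * parity                          # data + parity sectors
    total = math.ceil(total / (parity + 1)) * (parity + 1)   # pad to multiple of parity+1
    return (total - data) / total                            # fraction lost to parity+padding

for width in (12, 11):
    for rs in (128 * 1024, 256 * 1024, 512 * 1024):
        print(f"{width}-wide Z2, recordsize {rs // 1024}K: "
              f"{raidz_cost(width, 2, rs):.0%} parity+padding")
```

For me this prints 24% / 18% / 18% for the 12-wide vdev and 24% / 21% / 19% for the 11-wide one, which matches the figures above, so hopefully I'm reading the spreadsheet right.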

If I have media (image) files on a dataset (usually multiple MBs, but some can be a few hundred KB), does it make sense to increase the ZFS record size? From what I read, a 1MB record size is the maximum suggested safe value. Is there much difference going from 128K to 256K, 512K or 1M?
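
Applying the same model to a multi-MB image file (a sketch under the same assumptions as above; I'm also assuming, though I'm not certain, that files smaller than the recordsize get a single smaller block anyway, so the few-hundred-KB files shouldn't be hurt by a bigger recordsize):

```python
import math

def record_cost(width, parity, sectors):
    # parity + padding fraction for one record of `sectors` data sectors
    stripes = math.ceil(sectors / (width - parity))
    total = sectors + stripes * parity
    total = math.ceil(total / (parity + 1)) * (parity + 1)
    return (total - sectors) / total

# hypothetical 4 MB image file on an 11-wide RAIDZ2, 4K sectors
file_bytes = 4 * 1024 * 1024
for rs in (128 * 1024, 1024 * 1024):
    records = math.ceil(file_bytes / rs)
    cost = record_cost(11, 2, rs // 4096)
    print(f"recordsize {rs // 1024}K: {records} records, ~{cost:.0%} parity+padding each")
```

So for the multi-MB files the jump from 128K to 1M records is roughly 24% -> 19% parity+padding on an 11-wide Z2, with most of the gain already there at 256K-512K.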

System:
Chassis SSG-5049P-E1CTR36L
Intel Xeon Silver 4214R, 12 cores, 2.4GHz base, 3.5GHz max turbo
4 x 32GB DDR4-2933 2Rx4 LP ECC RDIMM
HDD for OS: 2 x Samsung PM883 240GB
L2ARC: 1 x Intel D7-P4610 1.6TB NVMe
36 x MG07SCA14TE - Toshiba 3.5" 14TB SAS 12Gb/s 7.2K RPM 256M 512E
 

HarryMuscle

Contributor
Joined
Nov 15, 2021
Messages
161
Did you ever find information that answers your fragmentation question? It's an interesting question that I'd love to know the answer to.

Thanks,
Harry
 

ilmarmors

Dabbler
Joined
Dec 27, 2014
Messages
25
I didn't get a clear answer to my questions, so I chose the less flexible but safer (for my goals) option 2: one pool with 3 x 11-disk RAIDZ2 vdevs (for the first dataset) + one pool with 2 HDDs in a mirror (for the second dataset) + 1 HDD as a spare disk (for the first pool).

I might be totally wrong, but for reasons I don't remember anymore, it looked like, internally, a dataset allocates space chunk by chunk from the pool's free space as it grows. That means that multiple datasets growing actively at the same time in one pool will end up fragmented. I had the option to batch writes to my mostly read-only dataset 1, but I wasn't sure that would be enough, so I ended up choosing option 2.

Maybe that chunk can be allocated in one big allocation, I don't know; if it could be done in one big chunk, that would help minimize fragmentation. In any case, you need to be familiar with very deep technical internals of ZFS to answer those questions on your own, and any incorrect assumption will bite you in the ass.
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
You should avoid large vdevs with 11 disks because they take several days to resilver. Going down to 6 or 8 disks per vdev is a better choice.
 