What are your thoughts on separate pools for critical and non-critical data?

Obi-Wan

Dabbler
Joined
Dec 28, 2018
Messages
17
My data consists of some that is critical, the stuff that is important to me, and some that I deem non-critical, the stuff that I don't necessarily have any backup of. It would be inconvenient to lose this data, but it wouldn't really affect my life much. (In fact, a part of me thinks it would be kind of healthy to lose it, but that's a different discussion!)

The non-critical data makes up the bulk of everything, and I think this will always be the case. In terms of bytes, the unimportant data will always be much larger than the important stuff.

Which leads to the question! What are your thoughts on having two pools in this scenario?

The primary reason for doing so is the all-important fact that if any vdev in a pool goes down, the pool goes down with it. If I end up with several vdevs in a pool, but most of the data is non-critical, splitting the data in two pools seems like a reasonable way to mitigate that risk.
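To put rough numbers on that reasoning, here is a minimal sketch. It assumes vdev failures are independent and uses a made-up per-vdev loss probability, so the figures are purely illustrative and not drive statistics. It compares the chance of losing the critical data when it shares a pool with several vdevs against keeping it in a pool of its own.

```python
from math import prod


def pool_loss_probability(vdev_loss_probs):
    """Chance that a pool is lost, given that losing any one vdev loses the pool.

    Assumes vdev failures are independent, which is a simplification.
    """
    return 1 - prod(1 - p for p in vdev_loss_probs)


# Hypothetical figure, for illustration only.
p_vdev = 0.01  # assumed chance of losing any single vdev over some period

single_pool = pool_loss_probability([p_vdev] * 3)  # critical data shares 3 vdevs
critical_pool = pool_loss_probability([p_vdev])    # critical data on its own vdev

print(f"one pool of 3 vdevs: {single_pool:.4f}")
print(f"dedicated pool:      {critical_pool:.4f}")
```

The exposure grows with every vdev added to the shared pool, which is the risk the split is meant to limit.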

This would also allow me to have different levels of redundancy for the two types of data, say two-disk redundancy for the important data and single-disk redundancy for the unimportant data. An argument against this is something I’ve seen a few times on the forum: large disks are generally not recommended in a RAIDZ1 vdev because of the long resilvering time. Part of the reason for splitting would be to store more of the unimportant data on fewer disks. But if single-disk redundancy really is not recommended for large disks, I would need at least two-disk redundancy for the unimportant data as well, and then three-disk redundancy for the important data for this point to have any merit. That is doable of course, but I find it a bit extreme, since I also have a backup.

Another point whose value I can’t judge is whether it is somehow easier to back up a whole pool than a dataset within a pool, or whether it makes anything else easier from a management point of view.

I think those are most of my thoughts, what are yours?
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hey Obi-wan,

To answer your question, it would be much better if you tell us more about your setup. How many drives do you have ? What size ? How many can you fit in your server ? How big are your non-critical and critical data ?

If your critical data is that small, you may look at backing it up to a cloud service like Google Drive, where you can have 15 GB of storage for free.

I would go with a single pool but different backup strategies. Separate pools will fragment your free space, and you will easily end up with non-critical data in the should-be-critical pool. Splitting will also increase the percentage of space used in each pool, which decreases performance. Performance will also be reduced if most of your transactions do not benefit from all your drives.

All mechanisms like zfs send / receive, snapshots and more work perfectly fine at the dataset level. You can also set a minimum and a maximum space (a reservation and a quota) for a specific dataset, letting you calibrate the size you wish for what should be critical or not.
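As a rough illustration of the minimum and maximum space idea, here is a small Python sketch that only builds the `zfs set` commands and does not run them. The pool name `tank`, the dataset names and the sizes are assumptions made up for the example; `reservation` and `quota` are the standard ZFS dataset properties.

```python
def zfs_set(dataset, **props):
    """Build (not run) one `zfs set` command per property."""
    return [["zfs", "set", f"{name}={value}", dataset] for name, value in props.items()]


# Hypothetical pool, dataset names and sizes.
commands = (
    zfs_set("tank/critical", reservation="500G")  # guaranteed minimum space
    + zfs_set("tank/media", quota="3T")           # hard upper limit
)

for cmd in commands:
    print(" ".join(cmd))
```

With something like this, the bulky non-critical dataset cannot eat the space set aside for the critical one, even though both live in the same pool.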

Reasons for different pools may be :
--encryption
--RaidZx vs Raid10 for different access needs (RaidZ for large sequential access ; Raid10 for IOPS on smaller / random data like iSCSI)
--big difference between disks (large conventional HDD vs small SSD ; ...)

Hope this helps feed your thoughts,
 

Obi-Wan

Hope this helps feed your thoughts

Thanks a lot! It did indeed!

To answer your question, it would be much better if you tell us more about your setup. How many drives do you have ? What size ? How many can you fit in your server ? How big are your non-critical and critical data ?

I added my build to my signature. There are 8 SATA ports on my X11SSM-F and my Node 804 can fit 10 3.5" disks. Currently I have about 200 GB of critical data and the rest of the 4 TB is almost full, though that data is located on other devices as well, so I would be able to remove the pool without any loss. My reason for starting with two mirrored disks was that I didn't want to commit to a big RAIDZ array, being new to NAS and not knowing how much storage I will need down the road. I’m a bit more worried now that one-disk redundancy isn’t enough in the long term.

I was thinking about the dual-pool when thinking about how I'm going to expand. I read about the RAIDZ expansion that is in the works, and got the idea of potentially keeping the two mirrored 4 TB disks for the critical data and adding a new RAIDZ pool for the rest.

I welcome any advice for my own build, but also felt there was very little discussion about having multiple pools and wanted to hear what people’s thoughts are on the matter. I get the feeling that most people on here don’t think it’s worth the effort, but I don’t quite understand why just yet.
 

Heracles

Hi again,

Usually you create different pools when you need different technology. RaidZ is good for long, sequential reads while Raid10 is much better at IOPS and so better if you serve iSCSI.

You may also go for different pools when you wish to activate a pool-wide option like encryption on one and not the other. Considering how easily pool encryption can turn into self-inflicted ransomware, I would not recommend it.

For what you describe, two different datasets in the same pool would easily do it. With different datasets, you can have different snapshot policies, do ZFS replication for one and not the other, set separate quotas and permissions, and more.
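To make the per-dataset idea concrete, here is a hedged Python sketch that plans, but does not execute, the commands for one backup run. The dataset names, the `backup/critical` target, the snapshot naming scheme and the date are all hypothetical, and the `zfs send` shown is a plain full stream rather than an incremental one.

```python
from datetime import date

# Hypothetical per-dataset policy; names and values are illustrative only.
POLICIES = {
    "tank/critical": {"snapshot": True, "replicate_to": "backup/critical"},
    "tank/media": {"snapshot": False, "replicate_to": None},
}


def plan_commands(policies, today):
    """Return the command strings one backup run would issue (nothing is executed)."""
    plan = []
    for dataset, policy in policies.items():
        if not policy["snapshot"]:
            continue  # this dataset gets no snapshots and no replication
        snap = f"{dataset}@auto-{today.isoformat()}"
        plan.append(f"zfs snapshot {snap}")
        if policy["replicate_to"]:
            plan.append(f"zfs send {snap} | zfs receive {policy['replicate_to']}")
    return plan


for line in plan_commands(POLICIES, date(2019, 1, 15)):
    print(line)
```

Only the critical dataset is snapshotted and replicated here; the media dataset is left alone, all within one pool.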

As for increasing the size of a pool, there are a few options. Auto-expand is one of them: once you have replaced all the drives in a vDev, its size increases to the new size automatically. For example, once you have replaced both of the 4 TB drives in your mirror with 10 TB ones (resilvering after each replacement or, better, adding the new drive before removing the old one), your mirror will grow from 4 TB to 10 TB. The same goes for RaidZ: 5x 4 TB in RaidZ2 offers 12 TB of storage. Once all drives are replaced with 10 TB ones, the RaidZ2 vDev will grow to 30 TB.
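The arithmetic behind those figures can be sketched in a few lines of Python. This is a simplified model that ignores ZFS overhead (metadata, padding, TB versus TiB), and the function names are made up for the example.

```python
def mirror_capacity_tb(disk_sizes_tb):
    """A mirror keeps a full copy on each disk, so usable space is the smallest disk."""
    return min(disk_sizes_tb)


def raidz_capacity_tb(disk_sizes_tb, parity):
    """RAIDZ treats every disk as the smallest one and gives up `parity` disks' worth."""
    return (len(disk_sizes_tb) - parity) * min(disk_sizes_tb)


print(mirror_capacity_tb([4, 4]))             # 4
print(mirror_capacity_tb([10, 4]))            # 4, only one drive replaced so far
print(mirror_capacity_tb([10, 10]))           # 10
print(raidz_capacity_tb([4] * 5, parity=2))   # 12
print(raidz_capacity_tb([10] * 5, parity=2))  # 30
```

The `min()` is why the extra space only shows up after the last small drive in the vDev has been replaced.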

You can also add vDevs to a pool. As of now, you have only 1 vDev, a mirror. You can create a second mirror with 2 other drives and add that second vDev to your existing pool. Mixing and matching vDevs is possible, like adding a RaidZ vDev to a pool containing a mirror, but not recommended. The reason is that you end up with the drawbacks of each vDev type instead of the benefits.
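For reference, adding a second mirror from the command line looks roughly like what the sketch below builds. The pool name `tank` and the device names `da2` and `da3` are assumptions, the helper itself is hypothetical, and the command is only printed, not run. FreeNAS users would normally do this through the web interface instead.

```python
def zpool_add_mirror(pool, disks, dry_run=True):
    """Build (not run) the command that adds a mirror vdev to an existing pool."""
    if len(disks) < 2:
        raise ValueError("a mirror needs at least two disks")
    cmd = ["zpool", "add"]
    if dry_run:
        cmd.append("-n")  # -n displays the resulting layout without changing the pool
    return cmd + [pool, "mirror", *disks]


# Hypothetical pool and device names.
print(" ".join(zpool_add_mirror("tank", ["da2", "da3"])))
```

Defaulting to the `-n` dry run is a deliberate choice in this sketch: it gives you a chance to check the layout before committing to it.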

Have fun designing your own setup,
 

Obi-Wan

Thanks again!

Hi again,
Considering how easily pool encryption can turn into self-inflicted ransomware, I would not recommend it.

This question is a bit on the side, but why do you consider the risk so high?
 

Heracles

This question is a bit on the side, but why do you consider the risk so high?

A lot of people lost their data when they lost their boot pool. Because they did not have an appropriate backup of the key, they were unable to recover.

You can look at it the other way: how many people recovered their data by just re-installing FreeNAS on a new boot device and importing their pool? A ton of them! They did not have a proper backup, so they were unable to restore their shares, permissions, services, etc. But that just means a few hours of re-configuration at most. Had they used encryption, they would have lost everything.

Considering that encryption only helps when you dispose of old drives, it is almost always a very small benefit in exchange for a gigantic risk.

Just don't do it...
 