Snapshots question

urobe

Contributor
Joined
Jan 27, 2017
Messages
113
Hey everyone,

Until now we had one Windows server that ran our office software and also served as a NAS for user files.
Our backup plan was to back up every night to a NAS (non-FreeNAS) using Acronis.

We would keep the last 7 backups, 6 backups from the first of each month, and 5 backups made on the first of January.
This way we had a decently old backup in case someone deleted something and didn't notice it.

The server software has been moved to the cloud and is accessed over the internet. This leaves only the NAS role for the server, which is why we're moving to FreeNAS.

My plan now is to use snapshots to get the file-history functionality I had with the individual backups. A second FreeNAS is located in the same network, 500 m away on a hill, basically off-site, connected with a Wi-Fi bridge; it will be the backup of the first NAS. To achieve this I would need to use the replication function, if I understand correctly.

My question now is about the snapshots. As I understand it, you are meant to define the office hours during which snapshots are taken. Even that is already a bit much for our needs; one snapshot a day would be sufficient. So I would rather define a time at night to take one snapshot and have it last 7 days.

To get the monthly "backup" I don't really see a way to keep one snapshot per month. The only option I can think of is to take a snapshot each Sunday, for example, and have those last 24 weeks.

For the yearly "backup" I have no idea how to set this up.

In general, I also wonder where these snapshots are stored.

I hope this is understandable.

Any help or hint is greatly appreciated.
Tobi
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi Tobi,

Snapshots live in a hidden directory at the root of your dataset. It is a bad idea to mess with them directly; you are better off relying on the WebUI for all your work. A dataset can also be configured to make its snapshots visible, but again, I think that is more risk than benefit.

A snapshot is taken almost instantly, in a fraction of a second. As such, you do not have to worry about taking one in the middle of a business day. As long as you keep to about 1,000 snapshots or fewer, you are good. More than 1,000 snapshots will make it slower and slower to manage them whenever you need to clone one or do anything else with them.

Because ZFS is copy-on-write, what a snapshot does can be oversimplified as:
--Creation of the snapshot: just create an empty directory and you are done. That is why it is lightning fast.
--After the snapshot is taken, whenever new data is written, do not touch the snapshot.
--After the snapshot is taken, whenever data is deleted, "move" it into the snapshot's hidden folder instead of actually deleting it.
--After the snapshot is taken, whenever data is modified, "move" the old version of the file to the hidden folder and save the new one in the "main" space.
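This simplified model can be sketched as a toy simulation of the observable behavior (purely illustrative; ZFS does not actually store snapshots this way): after a snapshot is taken, a deleted file disappears from the live dataset but remains readable through the snapshot.

```python
# Toy model of snapshot behavior (illustration only, not ZFS internals).
# A "snapshot" here is simply a frozen copy of the name -> content mapping.

class Dataset:
    def __init__(self):
        self.files = {}        # live file tree
        self.snapshots = {}    # snapshot name -> frozen file tree

    def write(self, name, data):
        self.files[name] = data

    def delete(self, name):
        del self.files[name]   # gone from the live view only

    def snapshot(self, snap):
        self.snapshots[snap] = dict(self.files)  # freeze the current state

ds = Dataset()
ds.write("report.doc", "v1")
ds.snapshot("nightly")
ds.delete("report.doc")

print("report.doc" in ds.files)               # False: deleted from the live data
print(ds.snapshots["nightly"]["report.doc"])  # "v1": still visible via the snapshot
```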

Your previous plan was :
Keep last 7 daily backups
Keep last 6 monthly backups
Keep last 5 yearly backups

To do that :
Create a periodic snapshot task, starting at say 03:30 AM and ending at 04:00 AM every day, taking snapshots at an interval of 1 day with a lifetime of 7 days.
Create another one, from say 04:00 AM to 04:30 AM on Sundays, taking snapshots at an interval of 4 weeks with a lifetime of 26 weeks.
As for the yearly snapshot, I do not know if such a long interval is possible. In any case, you have so few snapshots that you can easily increase the retention of your monthly ones instead.

As a comparison, I do :
snapshots every 15 minutes for 2 days (4 x 24 x 2 = 192 snapshots)
snapshots every hour for 1 week (24 x 7 = 168 more; 360 total)
snapshots every day for 4 months (31 x 4 = 124 more; 484 total)
snapshots every week for 1 year (52 more; 536 total)
snapshots every 4 weeks for 4 years (52 more; 588 total)
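The arithmetic in this schedule can be checked in a few lines (the per-line figures are taken from the schedule itself):

```python
# Snapshot counts for the schedule above.
schedule = [
    ("every 15 min for 2 days",   4 * 24 * 2),   # 192
    ("every hour for 1 week",     24 * 7),       # 168
    ("every day for 4 months",    31 * 4),       # 124
    ("every week for 1 year",     52),
    ("every 4 weeks for 4 years", 4 * 52 // 4),  # 52
]
total = sum(count for _, count in schedule)
print(total)  # 588, comfortably under the ~1,000 guideline
```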

Snapshots are not actual backups, so replicating the data and snapshots to a secondary NAS off-site is a good thing. Technically it is not completely offline, but that risk can be accepted (I do the same; I accepted the risk).

Good luck configuring your own setup,
 

urobe

Contributor
Joined
Jan 27, 2017
Messages
113
Thank you very much for your very good and extended answer!

Your snapshot plan seems very reasonable and well thought through. I think I might adapt it.


One more question is on my mind:

Let's assume I have created a file that has been seen by all snapshot tasks: the ones that last two days, a week, four months, a year, and 4 years.
If I delete the file, the task whose snapshots last two days will be the first to record that the file is gone. Will the "deleted" file only be linked to this snapshot, or also to the other ones? Asked differently: will the deleted file stay in a snapshot for 2 days or for 4 years?
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
Saying files get moved is misleading...

https://forums.freenas.org/index.php?threads/why-do-snapshots-increase-in-size.70762/

What actually happens when you take a snapshot is that it records the time.
So from the point of view of a file being deleted, it stays in the pool for as long as the retention of all snapshots taken between the creation of the file and its deletion. If your retention is 4 years, you need storage overhead to handle 4 years of storage usage growth (or a plan to expand storage in that time).
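This overhead point can be made concrete with back-of-the-envelope arithmetic (the churn figure below is invented purely for illustration): data that is written and later deleted stays pinned by every snapshot covering its lifetime, so long retention turns daily churn into semi-permanent space usage.

```python
# Rough worst-case snapshot overhead from churn (illustrative numbers only).
churn_gb_per_day = 2       # assumed: data written then deleted/overwritten each day
retention_days = 4 * 365   # longest snapshot lifetime in the plan: 4 years

# Worst case: every churned gigabyte stays pinned until the last
# snapshot that saw it expires.
overhead_gb = churn_gb_per_day * retention_days
print(overhead_gb)  # 2920 GB of capacity consumed by snapshots alone
```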

Personally I keep many but short-lived snapshots on "busy" datasets (databases, web applications, document shares) and few but long-lived snapshots on "static" datasets (photos, financials, documents).

In any case, keep a solid backup!
https://blog.trendmicro.com/trendlabs-security-intelligence/world-backup-day-the-3-2-1-rule/

For backups I create dedicated snapshots and copy their content to my secondary storage. I then delete the snapshot; it's just easier to script than to find an appropriate automatic snapshot, in my opinion.

Take Nextcloud for instance; what I do is I set the application to maintenance mode, dump the database and create a snapshot of the jail, the static file storage dataset and the database dataset. This takes a few seconds (it’s a small home server). I then return Nextcloud to production mode and the users don’t really notice the interruption. I then copy the content of the snapshots (including the database dump) to a dedicated dataset where I keep snapshots for each backup. The temporary snapshots are then removed and the snapshots on the backup dataset can be sent to tertiary storage.
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi again,

As I said, the description I gave for snapshots is an oversimplification...

All snapshots refer to each other in such a way that every piece of data exists only once. Should one snapshot expire or be deleted, only content unique to that snapshot is deleted. To stay with the comparison of "moving" data from one place to another: data is always moved to the oldest snapshot it applies to. Once that snapshot is deleted or expires, the content from it that is still referred to by other snapshots is "moved" to the next-oldest snapshot. Only once no snapshot refers to a piece of data anymore is that data actually deleted.

In the end, each snapshot is a frozen copy of the data as it was at the moment the snapshot was taken. But instead of keeping multiple copies of the data for multiple snapshots, the data is linked and shared between all the snapshots.

As for keeping snapshots for a very long period: indeed, you must have the required space for that. As for me, I have way more space than I need. I also plan to rotate my hard drives very slowly, about one every 9 months, to reduce the risk of drive failure. In doing so I will replace the drives with bigger ones, and once all of them have been replaced, ZFS will expand the storage automatically.
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
Again, saying files "move" between snapshots is misleading. You can clone any snapshot and see the data in the dataset as it was at that point in time. No data is "moved" between snapshots when a snapshot gets deleted; only that timestamp and the corresponding data are removed.

I tried to reuse an older post, but it goes something like this:

You save a 10 MB file and it gets written to ~80 128 KB blocks. You take a snapshot, recording a timestamp. You then modify 5 MB worth of the file, causing 40 new blocks to be written. ZFS checks whether any snapshot has a timestamp newer than the birth time of the 40 old blocks about to be freed, finds the snapshot you took, and leaves those blocks alone. If you then clone the snapshot, you read the 80 original blocks, ignoring the 40 new ones. Your snapshot now references 5 MB of data.

You then take another snapshot. At some point you decide to remove the file, and it will indeed be "removed" from the dataset as it is read by the users. As ZFS is about to clear the blocks referenced by the file (120 blocks), it checks for snapshots and finds the two you took. The file is no longer visible in your dataset, but if you clone either of the two snapshots you will see the file as it was when that snapshot was taken: the first snapshot showing the original 80 blocks, and the second showing the original 40 plus the 40 new blocks. In total you now have 15 MB worth of snapshots.

If you delete the first snapshot, ZFS checks all 120 blocks and finds 40 it can remove. You can now only see the file as it looked after modification, and your snapshot data usage is 10 MB. Remove the second snapshot and the remaining 80 blocks get deleted; your file is now gone from the pool as well.

If no changes had been made between the two snapshots, both would reference the original 80 blocks, and thus both snapshots would have to be removed before the 10 MB of disk space became available again.
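The walkthrough above can be sketched as a small block-level model (a simulation for illustration, not ZFS internals): a snapshot is just a recorded transaction number, and a freed block survives only while some snapshot falls between its birth and its death.

```python
# Block-level model of the 10 MB example above (illustration only).
BLOCK_KB = 128

class Pool:
    def __init__(self):
        self.txg = 0          # transaction counter (stands in for time)
        self.blocks = {}      # block id -> [birth_txg, death_txg or None]
        self.snapshots = []   # recorded txgs, nothing more
        self.next_id = 0

    def write(self, n):
        ids = list(range(self.next_id, self.next_id + n))
        for i in ids:
            self.blocks[i] = [self.txg, None]
        self.next_id += n
        self.txg += 1
        return ids

    def snapshot(self):
        self.snapshots.append(self.txg)
        self.txg += 1
        return self.snapshots[-1]

    def free(self, ids):
        for i in ids:
            self.blocks[i][1] = self.txg   # dead from the live view
        self.txg += 1
        self._gc()

    def destroy_snapshot(self, s):
        self.snapshots.remove(s)
        self._gc()

    def _gc(self):
        # A dead block survives only while a snapshot pins its lifetime.
        for i, (birth, death) in list(self.blocks.items()):
            if death is not None and not any(birth <= s < death for s in self.snapshots):
                del self.blocks[i]

    def used_mb(self):
        return len(self.blocks) * BLOCK_KB // 1024

p = Pool()
original = p.write(80)          # 10 MB file: 80 x 128 KB blocks
snap1 = p.snapshot()
replacement = p.write(40)       # modify 5 MB: 40 new blocks...
p.free(original[:40])           # ...and 40 old ones, kept alive by snap1
snap2 = p.snapshot()
p.free(original[40:] + replacement)   # delete the file entirely

held_by_both = p.used_mb()      # 15: all 120 blocks pinned by the snapshots
p.destroy_snapshot(snap1)
held_by_snap2 = p.used_mb()     # 10: the 40 replaced original blocks reclaimed
p.destroy_snapshot(snap2)
print(held_by_both, held_by_snap2, p.used_mb())  # 15 10 0
```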
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hey garm,

I know the details of block handling in ZFS and how it is used to build all the ZFS features, including snapshots. That is why I said the illustration is an oversimplification. Still, it makes it easier for someone coming from a Windows filesystem to understand the end result without needing to know the details of what happens under the hood.

Which is better: to help a new user visualize the functional result he will interact with, or to lose him in a low-level block-handling algorithm that he should not touch or interact with, to avoid burning himself?

In this case, I consider that the oversimplification provides better help to the user. The only requirement is for the user to know that it is an oversimplification, and I said so from the start. Should he wish to dive deep into the details from there, that is up to him, but he is not forced to. By taking him straight to the deepest level of block handling, you force him to a level he is not familiar with, so you increase the risk of misunderstanding, mistakes, errors and consequences.

So here, what is the consequence of visualizing the data as being "moved" from one snapshot to another, other than it not being the exact technical detail under the hood? There is none, and the user has been told the illustration is oversimplified, so he knows there is more to learn about the subject.
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
There is no moving of files; that is a misleading statement. And the OP's follow-up question invalidated everything you just said.
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
We all know that there is no moving of files, and you are the only one not understanding what an oversimplification is. So now that you have shown everyone how well you know block-level handling of data in ZFS, go learn what an oversimplification is, what it is used for, and how important it is to formulate an answer according to the person it is provided to.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Sort of misleading, but not really. The reason I say that is because ZFS is a copy-on-write file system. There is copying/moving happening, but it's transparent and happens down at a low level.

Say you have a file made of blocks referenced by numbers. The file initially consists of blocks:

1 2 3 4

You take snapshot1 which remembers and references that file as "1 2 3 4".

Your program then edits data in block "2". ZFS reads block "2" in, but writes the changes out as block "5". So now the file after snapshot1 consists of:

1 5 3 4

But snapshot1 still holds a reference to block "2", so it doesn't get erased. You take snapshot2, which now references your file as "1 5 3 4".

Your program now alters data in blocks 1, 3, and 4. Again, ZFS will create new blocks 6, 7, and 8. So your file post-snapshot2 now consists of:

6 5 7 8

So at this point, you have two snapshots, 3 versions of your file, and 8 blocks in use.

Snapshot1 either ages out, or you delete it. Block 2 now has nothing referencing it, and that space gets freed. If you subsequently delete snapshot2, blocks 1, 3 and 4 will then have no references pointing at them, and they'll get erased, and you'll be back to having only 4 blocks "used".
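This block-numbering walkthrough can be sketched as a tiny reference model (illustrative only): a block stays allocated while the live file or any snapshot still references it.

```python
# Set-based sketch of the block walkthrough above (illustration only).
def allocated(*refs):
    # A block is in use if any holder (live file or snapshot) references it.
    return sorted(set().union(*refs))

live = {1, 2, 3, 4}            # file initially: blocks 1 2 3 4
snap1 = set(live)              # snapshot1 references 1 2 3 4
live = {1, 5, 3, 4}            # block 2 rewritten as block 5
snap2 = set(live)              # snapshot2 references 1 5 3 4
live = {6, 5, 7, 8}            # blocks 1, 3, 4 rewritten as 6, 7, 8

after_edits = allocated(live, snap1, snap2)   # all 8 blocks in use
snap1 = set()                                 # snapshot1 ages out: block 2 freed
after_snap1 = allocated(live, snap1, snap2)   # 7 blocks remain
snap2 = set()                                 # snapshot2 deleted: 1, 3, 4 freed
print(after_edits, after_snap1, allocated(live, snap1, snap2))
```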
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
And yet, what you actually said is a misrepresentation of what ZFS is actually doing, not merely a simplification. How do you expect a layman to make reasonable estimations of the cost of snapshots when you say stuff like:

All snapshots refer to each other in such a way that every piece of data exists only once. Should one snapshot expire or be deleted, only content unique to that snapshot is deleted. To stay with the comparison of "moving" data from one place to another: data is always moved to the oldest snapshot it applies to. Once that snapshot is deleted or expires, the content from it that is still referred to by other snapshots is "moved" to the next-oldest snapshot.

There is no data moved between snapshots; snapshots are "only" timestamps. They don't reference anything other than a point in time, and definitely not each other.

What you see when you clone one is the data as it was at that point in time. But you are not looking at a copy of the file "moved" into the snapshot. ZFS kept all the data blocks of that file, and as it was edited, the old data being replaced was never removed; hence you are able to look at an earlier version of the file (and, by extension, the entire filesystem).

There is enough misunderstanding about ZFS out there that we shouldn't foster it in this forum of all places.

When dealing with any copy-on-write file system for reliable storage over time, I expect the admin to grasp some key aspects, such as the fact that once a block has been written it. Will. Not. Change. All you can do is delete it, and that is the magic of reliable long-term storage: you (and by that I mean your system) know exactly what each block should contain. And if it reads something else, it can be fixed, given there is redundancy. Unless this is known (together with a few other things) at a 33,000-foot level, you end up making bad decisions and losing your data.

If you want a useful, simplified, view of snapshots you can take a look at this presentation https://youtu.be/MsY-BafQgj4
Skip to 45:30 to jump directly to snapshots.
 

urobe

Contributor
Joined
Jan 27, 2017
Messages
113
Thank y'all for your responses. I think I now understand the concept behind snapshots decently well.

I also read the article about the 3-2-1 rule, which does make sense. I was planning on using our old backup system (three Synology NAS) as a backup of the current data, basically without the snapshots. They would not be as off-site as the second FreeNAS, but at least at the other end of a rather long building.
However, this would mean that I have hard drives only as backups, so I would put most of our data, or at the very least the really sensitive parts, on a compressed LTO-6 tape. Do you think this would be necessary? Or advisable? The second FreeNAS wouldn't see too much load, so it could be backed up to tape once a week.
My understanding is that one doesn't see the replicated data by default on the second FreeNAS. Is it possible to make it visible for the purpose of having another computer write this data to tape, or would that compromise the very idea of the replication?
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
My understanding is that one doesn't see the replicated data by default on the second FreeNAS. Is it possible to make it visible for the purpose of having another computer write this data to tape, or would that compromise the very idea of the replication?

You should probably read up on zfs send and zfs receive. These are used in conjunction with snapshots and carry the filesystem metadata, etc. You can send entire pools (as a top-level snapshot) or incremental snapshots. I don't know what support exists in FreeNAS for LTO tape, but the original Oracle ZFS could dump to a tape device or even a file. It's very much like the tar command in that regard. You're on your own for splitting your data set across multiple cartridges.

On Edit: There used to be some best-practice guidelines on how to handle snapshots in backup scenarios. I believe the general rule was to dump in smaller sets and not try to carry a large batch of snapshots, but it's been several years and I can no longer quickly locate that guidance.
 