Help me understand what's happening with my snapshots...

Status
Not open for further replies.

electricd7

Explorer
Joined
Jul 16, 2012
Messages
81
Hello,

I am trying to understand whether my snapshots are working correctly or whether I have a problem. The way I understand snapshots (from a history with NetApp and EMC) is that a snapshot holds the change delta from the time at which it was initiated. So if I snapshot a volume and then add a 50GB video file, the snapshot size should be around 50GB (i.e., it's 0 at the time of the snap because there are no changes, then it grows to 50GB as data gets written). Then let's say I snap it again and afterwards add another 50GB video file. Now the original snapshot should be 100GB and the latest snapshot should be only 50GB. Is this correct?

I am asking because I am running a script which makes daily snapshots of a dataset. The command I am issuing is:

zfs snapshot pool1/video@nightly.0

Then after adding a video I run the following:

zfs list -t snapshot

The results are as follows:

NAME USED AVAIL REFER MOUNTPOINT
pool1/video@nightly.0 0 - 807G -

To me, the USED column should show 50GB or so, but it remains at 0 no matter how much data I add. Am I doing something incorrectly? Thanks in advance!

ED7
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
First of all, there is a snapshot interface in the FreeNAS web GUI that will handle automated snapshots. You can set when and what to capture and how long they should be retained.

Also, your script seems to always use the same snapshot name. That will fail on every attempt after the first snapshot is made, unless you're actually incrementing that suffix number, or perhaps the script destroys nightly.9 and then renames the other snapshots before creating the new nightly.0? Either way, take a look at the FreeNAS-provided snapshot tasks (look under "Storage" in the GUI).

Anyway, I think you have your understanding backwards. If you make a snapshot and then add data, the new data will not be in the old snapshot. If you then make a new snapshot, the new data will be in the new snapshot, but not the first. Now, things also get complicated because the snapshot isn't considered to "use" data if that data is also referenced somewhere else, such as the current state of the filesystem, or another snapshot.

Below, I've put together a scenario where every day you add 50 GB of new data and delete 25 GB. You might see something like the following (assuming you're renaming the snapshots, as I question above, such that nightly.0 is always the most recent snapshot):

Code:
NAME                    USED  AVAIL  REFER  MOUNTPOINT
pool1/video@nightly.0      0      -   1.0T  -
pool1/video@nightly.1    25G      -   975G  -
pool1/video@nightly.2    25G      -   950G  -
pool1/video@nightly.3      0      -   900G  -
pool1/video@nightly.4      0      -   850G  -
pool1/video@nightly.5    25G      -   825G  -
pool1/video@nightly.6    25G      -   800G  -


So, most days you have a net increase of 25 GB (added 50, deleted 25), which is shown in the "Referenced" column. The "Used" column shows 25 GB for most snapshots; the exceptions are the most recent one and nightly.3 and nightly.4. Essentially, this column reports the amount of data that is referenced only by this snapshot (in other words, data that was present when the snapshot was taken but was deleted before the next snapshot).

Now, on the days when nightly.3 and nightly.2 were created, you added 50 GB of data but didn't delete anything. Because of this, "Referenced" grows by 50 GB from nightly.4 to nightly.3 and again from nightly.3 to nightly.2, yet nightly.4 and nightly.3 don't "Use" any space, because all of the data they reference is still referenced elsewhere (by the live filesystem or by a later snapshot). If you were to delete the files that were new to nightly.4 at this point, the "Used" column wouldn't change, because that data would still be referenced by more than one snapshot. Only after you destroyed the other snapshots that reference it would the "Used" value of the remaining snapshot increase by the size of that data.

It's a bit complicated conceptually, but the easiest way to think about it is that a snapshot lets you roll back to how the filesystem looked when the snapshot was taken. The referenced data is therefore what the filesystem took up at that time. The used data is how much is unique to that snapshot. This does make things difficult when trying to free up space: you'll know how much is freed by destroying each snapshot, but the "Used" size of the other snapshots may change each time you destroy one.
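To make the "unique to that snapshot" idea concrete, here is a rough sketch that totals the USED values of a dataset's snapshots. The helper name `sum_snapshot_used` is made up for illustration. It assumes the input is tab-separated name/bytes pairs, which `zfs list -H -p -t snapshot -o name,used -r pool1/video` should produce on a ZFS release that supports both `-H` and `-p`; older releases may not accept `-p`. The total is only a lower bound on what destroying all snapshots would free, because blocks shared by two or more snapshots are not counted in any single snapshot's USED.

```shell
#!/bin/sh
# Hypothetical helper: sum the USED column of snapshot listings read from stdin.
# Expected input: tab-separated "name<TAB>used-in-bytes" lines, e.g. from
#   zfs list -H -p -t snapshot -o name,used -r pool1/video
# (-H: no header, -p: exact byte counts; assumes your ZFS supports both).
sum_snapshot_used() {
    awk -F '\t' '
        # Per-snapshot line: space referenced by this snapshot alone.
        { total += $2; printf "%s\t%.1f GiB unique\n", $1, $2 / 1073741824 }
        # Final line: sum of the per-snapshot unique space (a lower bound).
        END { printf "total unique to single snapshots\t%.1f GiB\n", total / 1073741824 }
    '
}
```

On a live system you would pipe the `zfs list` command above into `sum_snapshot_used`. Comparing the result with the dataset's `usedbysnapshots` property (`zfs get usedbysnapshots pool1/video`) should show how much space is shared between snapshots rather than unique to one.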

Here's more info on snapshots: http://docs.oracle.com/cd/E19253-01/819-5461/gbiqe/index.html
 

electricd7

Explorer
Joined
Jul 16, 2012
Messages
81
Thanks for the detailed response! I should have been clearer about what my script was doing, but yes, I am deleting the oldest snapshot first and renaming each remaining snapshot to increment its name by 1 day. The newest snapshot is always .0. I know that the GUI has a process for creating automated snaps, but I have created a script that creates quiesced VMware snapshots of the machines on that volume first, then snaps the FreeNAS filer, then deletes the VMware snaps. If you are familiar with NetApp's SMVI process, this is what I am replicating with my script. I also muddled my initial post in the way I explained my understanding of snapshots... I do realize that data created after the initial snap won't be included until the next snap. I was just concerned because I wasn't sure how to interpret the output of my zfs list -t snapshot command. I think *maybe* I get it now, but I will definitely read up on the link you have provided. Thanks again for your response, it really does help me (and I hope others as well!)
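For anyone following along, the rotation described above (destroy the oldest, shift the rest by one, take a new .0) might look roughly like the sketch below. It is not the actual script from this thread. The function name `rotate_snapshots` and the `ZFS_CMD` variable are made up for illustration, and the VMware quiesce and cleanup steps are left out entirely. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a rotating nightly snapshot scheme (assumed layout: nightly.0 is newest).
# ZFS_CMD defaults to "echo zfs", so the commands are only printed;
# set ZFS_CMD=zfs to actually execute them.
ZFS_CMD="${ZFS_CMD:-echo zfs}"

rotate_snapshots() {
    dataset="$1"   # e.g. pool1/video
    keep="$2"      # number of snapshots to retain, e.g. 7

    oldest=$((keep - 1))

    # Drop the oldest snapshot to make room (tolerate it not existing yet).
    $ZFS_CMD destroy "${dataset}@nightly.${oldest}" || true

    # Shift nightly.(n-1) to nightly.n, oldest first so names never collide.
    n="$oldest"
    while [ "$n" -gt 0 ]; do
        prev=$((n - 1))
        $ZFS_CMD rename "${dataset}@nightly.${prev}" "${dataset}@nightly.${n}" || true
        n="$prev"
    done

    # Take the new snapshot; it always gets the .0 suffix.
    $ZFS_CMD snapshot "${dataset}@nightly.0"
}
```

Calling `rotate_snapshots pool1/video 7` in the default dry-run mode lists one destroy, six renames and one snapshot command, which is an easy way to check the ordering before pointing it at a real pool.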
 