Where are ZFS snapshots actually stored?

Status
Not open for further replies.

jrodder

Dabbler
Joined
Nov 10, 2011
Messages
28
I am sure this is a silly question. I was playing with ZFS snapshots. I made a snapshot of a volume that had 139 GB of data. I see the shots happening every hour, and added in some data and saw that the snapshots afterwards reflected that. I then deleted everything on the volume. I can still then clone older snapshots, and all the data is there and accessible. Where is the data actually stored then?
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
The snapshots are not really stored per se. Think of it this way: every time you write a block to the disk, a time stamp is written along with the block, indicating when the write took place.

Now when you change a file, instead of the old block that contained the file getting overwritten, a new block is written with the file in it, along with a new time-stamp.

The directory entry for the directory containing the file, which itself is also a block with a timestamp, is then also rewritten to point to the new block. This directory change again is to a new block with a new timestamp.

So now a snapshot is nothing more than a timestamp itself. Using the snapshot's timestamp, you can reconstruct what the drive looked like at that time, by looking at all the block timestamps.

There are now some intricacies based on how often you "take a snapshot" that dictate which of the old blocks actually need to be retained. It is easy to see that if a file got written to in 2008 and 2009, but the earliest snapshot that exists on the machine is from 2010, there is no way to get the 2008 version back by reverting to a snapshot, so the blocks associated with that 2008 version can be returned to the pool of free disk space.

There are similar rules which dictate which blocks need to be retained for files that get modified multiple times in between two snapshots.

So you can see that a snapshot initially takes up no space at all. It is only when you start modifying files after a snapshot is taken that blocks which would normally become free disk space need to be retained, so that you can still revert to an older version of the file. These blocks can be released back to the pool when the last snapshot that still references them is deleted.

Of course in ZFS all this is implemented in very clever ways so that it doesn't make the entire volume dog slow. :D
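The bookkeeping described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not how ZFS is implemented: real ZFS records a birth transaction group number per block rather than a wall-clock time stamp, and the names `Block`, `Pool`, and `retained` are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    file: str                     # which file this block holds a version of
    born: int                     # "time stamp" of the write
    freed: Optional[int] = None   # when the live file system stopped using it


@dataclass
class Pool:
    blocks: list = field(default_factory=list)
    snapshots: set = field(default_factory=set)  # a snapshot is just a time stamp

    def delete(self, file, now):
        # Retire the live block for this file; nothing is erased yet.
        for b in self.blocks:
            if b.file == file and b.freed is None:
                b.freed = now

    def write(self, file, now):
        # Copy-on-write: retire the old version, then add a new block.
        self.delete(file, now)
        self.blocks.append(Block(file, now))

    def retained(self):
        # Live blocks stay. A retired block stays only if some snapshot
        # was taken while it was live (born <= snapshot < freed).
        return [
            b for b in self.blocks
            if b.freed is None
            or any(b.born <= s < b.freed for s in self.snapshots)
        ]


pool = Pool()
pool.write("report", now=2008)
pool.write("report", now=2009)
pool.snapshots.add(2010)          # earliest snapshot is from 2010
pool.write("report", now=2011)

# The 2008 version is gone; the 2009 version is pinned by the 2010 snapshot.
kept_with_snapshot = [(b.file, b.born) for b in pool.retained()]

pool.snapshots.discard(2010)      # delete the snapshot
kept_without_snapshot = [(b.file, b.born) for b in pool.retained()]
```

With the 2010 snapshot in place, the 2009 and 2011 versions are kept; once the snapshot is deleted, only the live 2011 version remains.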
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
They are stored in the storage pool that houses all your ZFS filesystems. Snapshots are similar to copies, except that initially they take up zero space and only grow as data changes (technical explanations are available).
 

jrodder

Dabbler
Joined
Nov 10, 2011
Messages
28
I do daily backups of my production machines to the FreeNAS. I was playing with hourly snapshots, but it looks like maybe I should be doing daily or even weekly snaps going off of your post? I plan to implement (or try to) a remote backup NAS over the WAN, of the snapshots. I was trying to get an idea of what kind of bandwidth one would expect with that type of scheme.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
If you plan to perform the backup using snapshot replication, it's probably best to do it often, since that will result in the least amount of stuff transferred per replication task.

Unless you keep updating the same small set of files over and over, in which case less frequent snapshots might be best, since you won't be transferring all the intermediate versions of the file. But perhaps you want this, so you can go back to each and every version of the file that has existed.

In general, snapshot replication doesn't suffer from the problem that rsync-based backups have, which is that rsync needs to figure out, each and every time, which files to back up. If you have tons and tons of files and do the rsync very frequently, eventually figuring out which files to transfer takes longer (and causes more disk access) than the actual transfer itself. With snapshots, the file system "knows" what has changed, so you don't have this "figuring it out" delay that you get with rsync.
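A rough way to picture both points, as a toy Python model rather than anything ZFS actually runs: if every block carries a "born" stamp, the data to send between two snapshots is just the blocks born after the first snapshot that are still in use at the second. The `Block` tuple, the `incremental_delta` function, and the file names are all hypothetical.

```python
from typing import NamedTuple, Optional


class Block(NamedTuple):
    file: str
    born: int              # stamp of the write that created the block
    freed: Optional[int]   # stamp at which the live file system dropped it


def incremental_delta(blocks, from_snap, to_snap):
    """Blocks born after from_snap that are still referenced at to_snap."""
    return [
        b for b in blocks
        if from_snap < b.born <= to_snap
        and (b.freed is None or b.freed > to_snap)
    ]


# One big file that never changes, one small file rewritten every hour.
blocks = [
    Block("static.iso", 0, None),
    Block("db.sqlite", 0, 1),
    Block("db.sqlite", 1, 2),
    Block("db.sqlite", 2, 3),
    Block("db.sqlite", 3, None),
]

# Hourly snapshots at 0, 1, 2, 3: every intermediate version is sent.
hourly_blocks_sent = sum(len(incremental_delta(blocks, t, t + 1)) for t in range(3))

# Snapshots only at 0 and 3: just the final version is sent.
single_blocks_sent = len(incremental_delta(blocks, 0, 3))
```

In this model the unchanged file is never part of a delta, and nothing has to be compared file by file; the filter on `born` is the whole "figuring it out" step. The hourly schedule sends three versions of the small file, the single increment only one.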
 

jrodder

Dabbler
Joined
Nov 10, 2011
Messages
28
So I was thinking I could keep the secondary FreeNAS server here on location until I have my full weekly backups done, the full and the differentials. Get the two machines synced, then move it offsite and set up the ZFS snapshot replication. Is that a correct way to tackle it?
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
I would set up the snapshot replication while you still have the machines next to each other on the network. The initial sync will also be through a snapshot replication which will wipe out whatever is already on the destination drive. This process will take a while (and cause network traffic) depending on how much data you have on the source machine.

Setting up snapshot replication is kind of a pain in the butt because you have to get password-less ssh authentication working. It's probably best to get all this stuff going before you move the target machine offsite.

Also, note there is no such thing as a "full backup" with snapshots. Only the oldest snapshot is "full" if you want to think of it in those terms. Everything else is incremental.
 

jrodder

Dabbler
Joined
Nov 10, 2011
Messages
28
Good deal. Nice to know I'm not totally off base then. I have read about the SSH auth issues. I'm not under huge pressure to get it finalized, so it will be fun. We would like to eventually offer this kind of setup to clients, for automated backups and automatic offsite sync without 3rd party cloud solutions.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Jrodder,

Here is a more specific answer to your question:

Snapshots of file systems are accessible in the .zfs/snapshot directory within the root of the containing file system. For example, if tank/home/ahrens is mounted on /home/ahrens, then the tank/home/ahrens@thursday snapshot data is accessible in the /home/ahrens/.zfs/snapshot/thursday directory

http://download.oracle.com/docs/cd/E19082-01/817-2271/gbiqe/index.html

EDIT: Of course now having said that I can't find them on my system. I know there was a folder I was able to see them in before....
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Ok, got it figured out. If you have datasets (like I do), the snapshots are accessible under the dataset directory in the .zfs/snapshot folder, which is hidden. You can make it visible, or just 'cd' into it if you know it is there. The way you make it visible is:

zfs set snapdir=visible zpool/zfilesystem

I was looking for them in the root like I said in my other post, which assumes you don't have datasets.

You can also see all of your snapshots by doing:

zfs list -t snapshot
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
Proto, that wasn't really the question, but OK :D. I think I actually answered the OP's question as to where snapshots are stored (or rather explained that they are not stored anywhere).

The .zfs directory and the subdirectories underneath it are virtual file systems that use the timestamped blockdata and the snapshot's timestamp, to recreate a view of the file system as it was at a prior date/time. Nothing under these .zfs subdirs takes up any space on the disk.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Durkatlon said:
Proto, that wasn't really the question, but OK :D. I think I actually answered the OP's question as to where snapshots are stored (or rather explained that they are not stored anywhere).

The .zfs directory and the subdirectories underneath it are virtual file systems that use the timestamped blockdata and the snapshot's timestamp, to recreate a view of the file system as it was at a prior date/time. Nothing under these .zfs subdirs takes up any space on the disk.

Huh??? Sorry Durk, what do you mean that wasn't the question? Both answers are correct; you just didn't explain all of the details. I can assure people reading this that snapshots DO take up space, GUARANTEED, and they can take up quite a bit of space if you don't pay attention to them. BUT, they are not exact copies of everything you snapshot, just the differences in what has been changed since they were created. So if you have a dataset with snapshots enabled and you delete some large files, like 500GB for example, and you expect your free space to go up by that amount so you can add new stuff, it won't: the snapshot is now taking up that space, 500GB.... So storing pointers is true, but not releasing the space of the files those pointers reference will use up space.
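One way to keep an eye on that space is to read the `used` property of each snapshot. The Python below is a minimal sketch, assuming a `zfs` binary that supports `zfs list -H -p -t snapshot -o name,used` (tab-separated, no header, exact byte counts); the function names and the sample data are made up for illustration. Note that a snapshot's `used` only counts space unique to that one snapshot, so blocks shared by several snapshots do not show up here; the dataset's `usedbysnapshots` property reports the combined figure.

```python
import subprocess


def parse_snapshot_usage(text):
    """Parse tab-separated `name<TAB>used` lines into (name, bytes) pairs,
    sorted with the largest snapshot first."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, used = line.split("\t")
        rows.append((name, int(used)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def list_snapshot_usage():
    # Not called here; needs a machine with ZFS and sufficient privileges.
    out = subprocess.run(
        ["zfs", "list", "-H", "-p", "-t", "snapshot", "-o", "name,used"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_snapshot_usage(out)


# Hypothetical sample output, not taken from a real system: the second
# snapshot is holding on to 500 GiB of files deleted after it was taken.
sample = "tank/data@auto-1\t0\ntank/data@auto-2\t536870912000\n"
usage = parse_snapshot_usage(sample)
```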
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
I'm sorry, but no...

I don't think you read what I wrote. I quite clearly explained how snapshots take up space and why/when. The OP wondered how snapshots can still be cloned when all the files have been deleted.

The .zfs directories do not come into play here. Those are just nice virtual views of the filesystem at the times indicated by the snapshot date. They are completely unnecessary to the workings of snapshots, and in fact are really there to make stuff like "Previous Versions" work. If you were to do a "du" in one of those directories, it would give you a size equal to how large the file system was at the time of the snapshot, which is clearly not the amount of disk space the snapshot actually takes up.

The OP based on his replies to me was clearly interested in how snapshots actually work, and the ".zfs" directory does not factor into the answer to that question.
 

jrodder

Dabbler
Joined
Nov 10, 2011
Messages
28
Going off of the wiki:

http://doc.freenas.org/index.php/Replication_Tasks

It looks like the IP is actually part of the SSH key pair setup process. I guess I should still get it all working locally, and when I move it to the remote site, just redo the SSH key process?

**edit**

BTW, both of your responses were helpful to my question. I certainly could have looked that up with enough digging, but am glad you offered up the simple explanation. :)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Put more simply, ZFS replication is very(!) lightweight except when you add or modify data. Then what gets sent is the delta between the two snapshots, which is really the best you could hope for anyways.
 