Understanding ZFS snapshots and clones

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
I was testing using ZFS replication to get a snapshot of a test dataset over to a new system, then making the transferred data accessible on the new system. I'm a bit fuzzy on how the whole snapshot-dataset relationship works. When I first transferred the snapshot to the "replication" dataset, everything was as I expected:

root@hq-nas-4[~]# zfs get -r origin data
NAME                                    PROPERTY  VALUE  SOURCE
data                                    origin    -      -
data/replication                        origin    -      -
data/replication@auto-2023-05-24_12-38  origin    -      -


Then I cloned the snapshot to a new dataset called "new". Everything still makes sense at this point:

root@hq-nas-4[~]# zfs get -r origin data
NAME                                    PROPERTY  VALUE                                    SOURCE
data                                    origin    -                                        -
data/new                                origin    data/replication@auto-2023-05-24_12-38   -
data/replication                        origin    -                                        -
data/replication@auto-2023-05-24_12-38  origin    -                                        -

I want to be able to delete the snapshot, but to do that I need to "promote" the "new" dataset, which I did. This shows "new" is no longer linked to the snapshot. Again, that part makes sense:

root@hq-nas-4[~]# zfs promote data/new
root@hq-nas-4[~]# zfs get -r origin data
NAME                            PROPERTY  VALUE                            SOURCE
data                            origin    -                                -
data/new                        origin    -                                -
data/new@auto-2023-05-24_12-38  origin    -                                -
data/replication                origin    data/new@auto-2023-05-24_12-38   -


The thing that does not make sense is that it now shows the "replication" dataset linked to the snapshot. Why is that? The system says there is a dependent clone when I try to delete the snapshot, but "replication" was there before the snapshot was ever copied over, so why would it now be considered a dependent clone? I thought "promote" would simply "unlink" the dataset from the source snapshot and put everything back the way it was.
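For reference, the rough sequence I ran to get to this state was (the receive step is shown schematically; it was actually done by the replication task):

```shell
# 1. Receive the snapshot into the "replication" dataset (done by the
#    replication task, shown here schematically):
#    zfs recv data/replication < (replicated stream)

# 2. Clone the received snapshot into a writable dataset:
zfs clone data/replication@auto-2023-05-24_12-38 data/new

# 3. Promote the clone. This reverses the clone/origin relationship:
#    the snapshot is transferred to data/new, and the former parent
#    becomes the dependent clone.
zfs promote data/new
```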
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Did you figure this out? I'm a bit befuddled by what you're doing. Normally if you just replicate a snapshot to an empty dataset, all the data are there and "accessible". More should not be needed, unless I'm missing something.
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
Maybe I'm making things more complicated than they are. Since the snapshot shows up on the destination unit I assumed that you had to do something with the snapshot to make it usable, like cloning it or something like that. Maybe the data is already there in the destination dataset and it's already usable.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The data is indeed there, though you might clone the snapshot to be able to write stuff and still receive new snapshots (but you'd end up with two disjoint branches that can't be re-integrated).
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
The data is indeed there, though you might clone the snapshot to be able to write stuff and still receive new snapshots (but you'd end up with two disjoint branches that can't be re-integrated).
I guess this is the part I don't understand about ZFS replication. Why is there a snapshot of the dataset or volume on the destination device and what is the proper procedure to make a dataset or volume usable on the destination, i.e. in the event of a disaster and I need to bring all of that online?

I've also tried this with an iSCSI volume, and I can get as far as mounting the volume, but something along the way prevents the guest server from seeing the file system that's in the source volume.
 
Joined
Oct 22, 2019
Messages
3,641
Why is there a snapshot of the dataset or volume on the destination device and what is the proper procedure to make a dataset or volume usable on the destination, i.e. in the event of a disaster and I need to bring all of that online?

You cannot replicate (send/recv) a live filesystem. When it comes to replications, you send an immutable snapshot, which is essentially the dataset (filesystem) as it was at the exact moment in time when the snapshot was taken. (An exact replica of the filesystem.)

To "use" a received dataset (from a backup / replication task of its snapshot), you toggle off the dataset's "read-only" option, and then you can have it mount like any other dataset. How recent will the files and folders be? As recent as the snapshot itself. (Anything that was created on the source's side after such a snapshot will not exist on your backup on the destination side.)

Keep in mind that it's assumed you'll be sending the "destination" snapshot back to your original pool, where you can start using it again. Here is where you'll remove its "read-only" option. You don't want to actually use the backup itself (on the destination), since any changes made will be wiped out the next time you run a replication task, which will revert it back to the latest snapshot + any incremental stream from the source side.
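On the command line, the equivalent (pool and dataset names here are just examples) would be something like:

```shell
# Example names only. To use a received dataset on the destination:
zfs set readonly=off backuppool/mystuff
zfs mount backuppool/mystuff        # if it isn't already mounted

# ...and set it back when done, so the next replication stays safe:
zfs set readonly=on backuppool/mystuff
```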
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
To "use" a received dataset (from a backup / replication task of its snapshot), you toggle off the dataset's "read-only" option, and then you can have it mount like any other dataset. How recent will the files and folders be? As recent as the snapshot itself. (Anything that was created on the source's side after such a snapshot will not exist on your backup on the destination side.)

Keep in mind that it's assumed you'll be sending the "destination" snapshot back to your original pool, where you can start using it again. Here is where you'll remove its "read-only" option. You don't want to actually use the backup itself (on the destination), since any changes made will be wiped out the next time you run a replication task, which will revert it back to the latest snapshot + any incremental stream from the source side.

Thanks very much for the explanation. I think I'm getting closer to understanding but still a little fuzzy on some things...

It makes sense that the destination snapshot would be immutable or "read-only" but the dataset or volume in the destination shows "false" in the "Readonly" column so it would appear the destination dataset or volume should be usable/mountable/writable.

You said it's assumed you'll be sending the "destination" snapshot back to the original pool... that makes sense, but if it's the snapshot that would be sent back, why wouldn't I be able to use the "parent" dataset or volume? In other words, if I use the parent dataset or volume and delete files, make changes, etc., the snapshot created on the destination system is still intact, so would I simply "rollback" to that? And if there hasn't been some kind of disaster and I go back to the source system's Replication Task and click "Restore", does it pull from the "destination" system's parent dataset or from the snapshot that was created during replication?

In a DR situation where the source system is no longer available and I need to make everything usable on the destination system, do I need to clone the snapshots to a new dataset to make them usable?
 
Joined
Oct 22, 2019
Messages
3,641
It makes sense that the destination snapshot would be immutable or "read-only" but the dataset or volume in the destination shows "false" in the "Readonly" column so it would appear the destination dataset or volume should be usable/mountable/writable.
It depends on how you configured the replication task. There's an option to set the destination dataset(s) as "read-only".

This read-only property is a dataset property. It is not tied to any snapshot. (A snapshot is always immutable.)

If your dataset (on the destination side) has "false" for read-only, then it's not really a big deal. It's just good practice to set it to read-only upon a completed replication, so as to not accidentally write new files within (which will be wiped out in the next replication received from the source.)


but if it's the snapshot that would be sent back, why wouldn't I be able to use the "parent" dataset or volume?
In other words, if I use the parent dataset or volume and deleted files, made changes, etc.
Where are you doing these things? On the source or the destination?


In a DR situation where the source system is no longer available and I need to make everything usable on the destination system, do I need to clone the snapshots to a new dataset to make them usable?
Aside from renaming the pool (root dataset), among other things, no, you don't need to clone anything.

If the destination has backuppool/mystuff@auto-20230601, then you can proceed to use the mystuff filesystem from its current state (June 1st, 2023); since that is its most recent snapshot. If you had a disaster on the source after June 1st, and no further snapshots were sent to the destination since then, it means you forever lose any changes since June 1st.
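If the live filesystem on the backup had been written to since the last replication, you could also roll it back to that latest snapshot before using it (same example names as above):

```shell
# Discard any local changes on the backup and return to the
# state of the latest received snapshot (June 1st):
zfs rollback -r backuppool/mystuff@auto-20230601
zfs set readonly=off backuppool/mystuff
```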
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
Where are you doing these things? On the source or the destination?
All of this is being done on the destination.

Regarding renaming the pool... both the source and destination pools are named "data". Does that make any difference? Does the destination have to have a different pool name? How would you go about renaming the pool? The "Name" field is greyed out in the GUI.
 
Joined
Oct 22, 2019
Messages
3,641
All of this is being done on the destination.
You shouldn't be making any changes or writing new files on the destination side of replication backups.


Regarding renaming the pool... both the source and destination pools are named "data". Does that make any difference? Does the destination have to have a different pool name?
It depends on how you recover from a disaster. If the root dataset on the destination has the same name, you could technically just use the destination hardware directly (physically bringing the drives over to the original server), and then it's back to business as usual. Or you can leave the destination pool as it is and simply replicate everything back to the (newly created) original pool. This new pool will be empty until you start to fill it with your datasets/snapshots from the destination. This way you get to keep the destination as a standalone backup.


How would you go about renaming the pool? The "Name" field is greyed out in the GUI.
The TrueNAS GUI does not support this. You'd have to manually import the pool using a specified name in the command-line. (It's a one-time action.)
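As a sketch of that one-time rename (the new name here is just an example):

```shell
# The pool must be exported before it can be re-imported under a new name.
zpool export data
zpool import data backupdata   # imports the pool named "data" as "backupdata"
```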
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
When I say "disaster", I'm assuming a physical disaster where none of the source hardware is available. In that case, where the original drives don't exist and I'm only left with what's on the destination system, what would be the process for getting those datasets usable?
 
Joined
Oct 22, 2019
Messages
3,641
That being the case where the original drives don't exist and I'm only left with what's on the destination system, what would be the process in getting those datasets usable?
Build a new "main" server, create a new pool (with the same name, preferably), and then replicate from the backup, using the latest snapshots on the backup server's datasets.

Once received on your new "main" server's pool, you're back in business.

The files and folders of your latest snapshot (on the backup pool's datasets) will all exist on your new "main" server. (Whatever existed at the point in time of the snapshot's creation and replication.)
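As a rough sketch of that recovery replication (host, pool, and dataset names are examples):

```shell
# Run on the backup server: send the full dataset tree, up to its
# latest snapshot, over to the freshly created pool on the new server.
zfs send -R backuppool/mystuff@auto-20230601 | \
    ssh new-main-server zfs recv -F data/mystuff
```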
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
O.k., but *how* do I make the datasets available on the new system once the replication is complete? It's not the replication itself; it's making the data available in a usable format once it's replicated.
 
Joined
Oct 22, 2019
Messages
3,641
O.k., but *how* do I make the datasets available on the new system once the replication is complete? It's not the replication itself; it's making the data available in a usable format once it's replicated.
It's automatic. TrueNAS will automatically mount any newly created datasets. If not, you can export and re-import the pool after replicating everything back over.
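If some datasets still aren't mounted after the replication, something like this (run as root, pool name as in this thread) usually sorts it out:

```shell
zfs mount -a                        # mount every dataset that isn't mounted
# or, failing that, re-import the pool:
zpool export data && zpool import data
```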
 

Json_

Cadet
Joined
Nov 27, 2023
Messages
2
What is the purpose of a ZFS clone filesystem? And why do clone filesystems created from the same snapshot at different times (during which the filesystem's data was read and written, and the snapshot's size changed) end up different?
clone.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you're not going to post in the Chinese section, it would be helpful if you wrote something most of us can understand...
 

Json_

Cadet
Joined
Nov 27, 2023
Messages
2
Thank you for reminding me. I'm sorry; I did write it in English earlier, but a browser plugin translated it into Chinese.
What is the USED value for ZFS clone filesystems? Why are clone filesystems created from the same snapshot at different times (during which the filesystem data is read and written, and the snapshot size changes) different? The following figure shows cloned filesystems created from the same snapshot:
clone.png

Thanks!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If I'm not mistaken, clones use the same logic as snapshots for the USED field: it represents the space that would be freed by deletion, i.e. the unique data not shared with other snapshots. In the case of a clone, that's the delta since the clone was created.
Why are clone filesystems created for the same snapshot at different times (during which the filesystem data is read and written, and the snapshot size changes) different?
They seem mostly identical, apart from one being 2 kB larger. Maybe a small file was changed or something; it's not something I'd spend much time thinking about.
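To illustrate the USED accounting (hypothetical pool and dataset names):

```shell
zfs clone tank/fs@snap tank/clone1
# ...write some data into /tank/clone1 here...
zfs clone tank/fs@snap tank/clone2
zfs list -o name,used,refer tank/clone1 tank/clone2
# clone1's USED grows with the data written since it was created;
# clone2's USED stays near zero until it too diverges from the snapshot.
```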
 