Understanding ZFS snapshots and clones

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
I was testing using ZFS replication to get a snapshot of a test dataset over to a new system, then making the transferred data accessible on the new system. I'm a bit fuzzy on how the whole snapshot-dataset relationship works. When I first transferred the snapshot to the "replication" dataset, everything was as I expected:

root@hq-nas-4[~]# zfs get -r origin data
NAME                                    PROPERTY  VALUE  SOURCE
data                                    origin    -      -
data/replication                        origin    -      -
data/replication@auto-2023-05-24_12-38  origin    -      -


Then I cloned the snapshot to a new dataset called "new". Everything still makes sense at this point:

root@hq-nas-4[~]# zfs get -r origin data
NAME                                    PROPERTY  VALUE                                    SOURCE
data                                    origin    -                                        -
data/new                                origin    data/replication@auto-2023-05-24_12-38   -
data/replication                        origin    -                                        -
data/replication@auto-2023-05-24_12-38  origin    -                                        -

I want to be able to delete the snapshot, but to do that I need to "promote" the "new" dataset, which I did. This shows "new" is no longer linked to the snapshot. Again, that part makes sense:

root@hq-nas-4[~]# zfs promote data/new
root@hq-nas-4[~]# zfs get -r origin data
NAME                            PROPERTY  VALUE                            SOURCE
data                            origin    -                                -
data/new                        origin    -                                -
data/new@auto-2023-05-24_12-38  origin    -                                -
data/replication                origin    data/new@auto-2023-05-24_12-38   -


The thing that does not make sense is that it now shows the "replication" dataset linked to the snapshot. Why is that? The system says there is a dependent clone when I try to delete the snapshot, but "replication" was there before the snapshot was ever copied over, so why would it now be considered a dependent clone? I thought "promote" would simply "unlink" the dataset from the source snapshot and put everything back the way it was.
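For reference, the rough sequence I ran to get to this state was (the receive step is shown schematically; it was actually done by the replication task):

```shell
# 1. Receive the snapshot into the "replication" dataset (done by the
#    replication task, shown here schematically):
#    zfs recv data/replication < (replicated stream)

# 2. Clone the received snapshot into a writable dataset:
zfs clone data/replication@auto-2023-05-24_12-38 data/new

# 3. Promote the clone. This reverses the clone/origin relationship:
#    the snapshot is transferred to data/new, and the former parent
#    becomes the dependent clone.
zfs promote data/new
```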
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Did you figure this out? I'm a bit befuddled by what you're doing. Normally if you just replicate a snapshot to an empty dataset, all the data are there and "accessible". More should not be needed, unless I'm missing something.
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
Maybe I'm making things more complicated than they are. Since the snapshot shows up on the destination unit I assumed that you had to do something with the snapshot to make it usable, like cloning it or something like that. Maybe the data is already there in the destination dataset and it's already usable.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The data is indeed there, though you might clone the snapshot to be able to write stuff and still receive new snapshots (but you'd end up with two disjoint branches that can't be re-integrated).
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
The data is indeed there, though you might clone the snapshot to be able to write stuff and still receive new snapshots (but you'd end up with two disjoint branches that can't be re-integrated).
I guess this is the part I don't understand about ZFS replication. Why is there a snapshot of the dataset or volume on the destination device and what is the proper procedure to make a dataset or volume usable on the destination, i.e. in the event of a disaster and I need to bring all of that online?

I've also tried this with an iSCSI volume, and I can get as far as mounting the volume, but something along the way prevents the guest server from seeing the file system that's in the source volume.
 
Joined
Oct 22, 2019
Messages
3,641
Why is there a snapshot of the dataset or volume on the destination device and what is the proper procedure to make a dataset or volume usable on the destination, i.e. in the event of a disaster and I need to bring all of that online?

You cannot replicate (send/recv) a live filesystem. When it comes to replications, you send an immutable snapshot, which is essentially the dataset (filesystem) as it was at the exact moment in time when the snapshot was taken. (An exact replica of the filesystem.)

To "use" a received dataset (from a backup / replication task of its snapshot), you toggle off the dataset's "read-only" option, and then you can have it mount like any other dataset. How recent will the files and folders be? As recent as the snapshot itself. (Anything that was created on the source's side after such a snapshot will not exist on your backup on the destination side.)

Keep in mind that it's assumed you'll be sending the "destination" snapshot back to your original pool, where you can start using it again. Here is where you'll remove its "read-only" option. You don't want to actually use the backup itself (on the destination), since any changes made will be wiped out the next time you run a replication task, which will revert it back to the latest snapshot + any incremental stream from the source side.
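On the command line, the equivalent (pool and dataset names here are just examples) would be something like:

```shell
# Example names only. To use a received dataset on the destination:
zfs set readonly=off backuppool/mystuff
zfs mount backuppool/mystuff        # if it isn't already mounted

# ...and set it back when done, so the next replication stays safe:
zfs set readonly=on backuppool/mystuff
```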
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
To "use" a received dataset (from a backup / replication task of its snapshot), you toggle off the dataset's "read-only" option, and then you can have it mount like any other dataset. How recent will the files and folders be? As recent as the snapshot itself. (Anything that was created on the source's side after such a snapshot will not exist on your backup on the destination side.)

Keep in mind that it's assumed you'll be sending the "destination" snapshot back to your original pool, where you can start using it again. Here is where you'll remove its "read-only" option. You don't want to actually use the backup itself (on the destination), since any changes made will be wiped out the next time you run a replication task, which will revert it back to the latest snapshot + any incremental stream from the source side.

Thanks very much for the explanation. I think I'm getting closer to understanding but still a little fuzzy on some things...

It makes sense that the destination snapshot would be immutable or "read-only" but the dataset or volume in the destination shows "false" in the "Readonly" column so it would appear the destination dataset or volume should be usable/mountable/writable.

You said it's assumed you'll be sending the "destination" snapshot back to the original pool... that makes sense, but if it's the snapshot that would be sent back, why wouldn't I be able to use the "parent" dataset or volume? In other words, if I use the parent dataset or volume and delete files, make changes, etc., the snapshot created on the destination system is still intact, so would I simply "rollback" to that? And if there hasn't been some kind of disaster and I go back to the source system's Replication Task and click "Restore", does it pull from the "destination" system's parent dataset or from the snapshot that was created during replication?

In a DR situation where the source system is no longer available and I need to make everything usable on the destination system, do I need to clone the snapshots to a new dataset to make them usable?
 
Joined
Oct 22, 2019
Messages
3,641
It makes sense that the destination snapshot would be immutable or "read-only" but the dataset or volume in the destination shows "false" in the "Readonly" column so it would appear the destination dataset or volume should be usable/mountable/writable.
It depends on how you configured the replication task. There's an option to set the destination dataset(s) as "read-only".

This read-only property is a dataset property. It is not tied to any snapshot. (A snapshot is always immutable.)

If your dataset (on the destination side) has "false" for read-only, then it's not really a big deal. It's just good practice to set it to read-only upon a completed replication, so as to not accidentally write new files within (which will be wiped out in the next replication received from the source.)


but if it's the snapshot that would be sent back, why wouldn't I be able to use the "parent" dataset or volume?
In other words, if I use the parent dataset or volume and deleted files, made changes, etc.
Where are you doing these things? On the source or the destination?


In a DR situation where the source system is no longer available and I need to make everything usable on the destination system, do I need to clone the snapshots to a new dataset to make them usable?
Aside from renaming the pool (root dataset), among other things, no, you don't need to clone anything.

If the destination has backuppool/mystuff@auto-20230601, then you can proceed to use the mystuff filesystem from its current state (June 1st, 2023); since that is its most recent snapshot. If you had a disaster on the source after June 1st, and no further snapshots were sent to the destination since then, it means you forever lose any changes since June 1st.
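If the live filesystem on the backup had been written to since the last replication, you could also roll it back to that latest snapshot before using it (same example names as above):

```shell
# Discard any local changes on the backup and return to the
# state of the latest received snapshot (June 1st):
zfs rollback -r backuppool/mystuff@auto-20230601
zfs set readonly=off backuppool/mystuff
```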
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
Where are you doing these things? On the source or the destination?
All of this is being done on the destination.

Regarding renaming the pool... both the source and destination pools are named "data". Does that make any difference? Does the destination have to have a different pool name? How would you go about renaming the pool? The "Name" field is greyed out in the GUI.
 
Joined
Oct 22, 2019
Messages
3,641
All of this is being done on the destination.
You shouldn't be making any changes or writing new files on the destination side of replication backups.


Regarding renaming the pool... both the source and destination pools are named "data". Does that make any difference? Does the destination have to have a different pool name?
It depends on how you recover from a disaster. If the root dataset on the destination has the same name, you could technically just use the destination hardware directly (physically bringing the drives over to the original server), and then it's back to business as usual. Or you can leave the destination pool as it is and simply replicate everything back to the (newly created) original pool. This new pool will be empty until you start to fill it with your datasets/snapshots from the destination. This way you get to keep the destination as a standalone backup.


How would you go about renaming the pool? The "Name" field is greyed out in the GUI.
The TrueNAS GUI does not support this. You'd have to manually import the pool using a specified name in the command-line. (It's a one-time action.)
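As a sketch of that one-time rename (the new name here is just an example):

```shell
# The pool must be exported before it can be re-imported under a new name.
zpool export data
zpool import data backupdata   # imports the pool named "data" as "backupdata"
```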
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
When I say "disaster", I'm assuming a physical disaster where none of the source hardware is available. In that case, where the original drives don't exist and I'm only left with what's on the destination system, what would be the process for getting those datasets usable?
 
Joined
Oct 22, 2019
Messages
3,641
That being the case where the original drives don't exist and I'm only left with what's on the destination system, what would be the process in getting those datasets usable?
Build a new "main" server, create a new pool (with the same name, preferably), and then replicate from the backup, using the latest snapshots on the backup server's datasets.

Once received on your new "main" server's pool, you're back in business.

The files and folders of your latest snapshot (on the backup pool's datasets) will all exist on your new "main" server. (Whatever existed at the point in time of the snapshot's creation and replication.)
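As a rough sketch of that recovery replication (host, pool, and dataset names are examples):

```shell
# Run on the backup server: send the full dataset tree, up to its
# latest snapshot, over to the freshly created pool on the new server.
zfs send -R backuppool/mystuff@auto-20230601 | \
    ssh new-main-server zfs recv -F data/mystuff
```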
 

TxAggieEngineer

Dabbler
Joined
Apr 25, 2023
Messages
16
O.k., but *how* do I make the datasets available on the new system once the replication is complete? It's not the replication itself; it's making the data available in a usable format once it's replicated.
 
Joined
Oct 22, 2019
Messages
3,641
O.k., but *how* do I make the datasets available on the new system once the replication is complete? It's not the replication itself; it's making the data available in a usable format once it's replicated.
It's automatic. TrueNAS will automatically mount any newly created datasets. If not, you can export and re-import the pool after replicating everything back over.
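If some datasets still aren't mounted after the replication, something like this (run as root, pool name as in this thread) usually sorts it out:

```shell
zfs mount -a                        # mount every dataset that isn't mounted
# or, failing that, re-import the pool:
zpool export data && zpool import data
```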
 

Json_

Cadet
Joined
Nov 27, 2023
Messages
2
What is the purpose of a ZFS clone filesystem? And why do clone filesystems created from the same snapshot at different times (during which the filesystem's data was read and written, and the snapshot's size changed) end up different?
clone.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you're not going to post in the Chinese section, it would be helpful if you wrote something most of us can understand...
 

Json_

Cadet
Joined
Nov 27, 2023
Messages
2
Thank you for reminding me. I'm sorry; I did write it in English earlier, but a browser plugin translated it into Chinese.
What is the USED value for ZFS clone filesystems? Why are clone filesystems created from the same snapshot at different times (during which the filesystem data is read and written, and the snapshot size changes) different? The following figure shows cloned filesystems created from the same snapshot:
clone.png

Thanks!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If I'm not mistaken, clones use the same logic as snapshots for the USED field: it represents the space that would be freed by deletion, i.e. the unique data not shared with other snapshots. In the case of a clone, that's the delta since the clone was created.
Why are clone filesystems created for the same snapshot at different times (during which the filesystem data is read and written, and the snapshot size changes) different?
They seem mostly identical, apart from one being 2 kB larger. Maybe a small file was changed or something; it's not something I'd spend much time thinking about.
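To illustrate the USED accounting (hypothetical pool and dataset names):

```shell
zfs clone tank/fs@snap tank/clone1
# ...write some data into /tank/clone1 here...
zfs clone tank/fs@snap tank/clone2
zfs list -o name,used,refer tank/clone1 tank/clone2
# clone1's USED grows with the data written since it was created;
# clone2's USED stays near zero until it too diverges from the snapshot.
```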
 