Problem with periodic snapshot deletion

Borja Marcos

I have found a problem when deleting expired periodic snapshots. I noticed that a system had a lot of expired snapshots that
had not been destroyed. After checking everything I determined that it was caused by a couple of old snapshots with clones.

Of course you can't delete a snapshot when it has a clone. However, this system has lots of datasets and the failure to destroy
a couple of snapshots was preventing the deletion of snapshots on completely unrelated datasets which had no clones.

I think this is wrong. If I have made a clone of a periodic snapshot (or if, for instance, I have placed a hold on it), it should still be possible
to destroy other expired snapshots for the same dataset, let alone expired snapshots for other datasets.

Of course there is a workaround, renaming the snapshot. But the UI doesn't offer that option either. So, the presence of a single
cloned snapshot will prevent the destruction of expired automatic snapshots even on unrelated datasets and in a somewhat
unpredictable fashion.

I would treat snapshot destruction in a dumber way, which actually makes it more flexible. If the destruction of a periodic snapshot fails,
the failure should just be ignored, or at most logged. In a replication tool I designed several years ago I do just that. Deleting a snapshot can fail
because of clones or holds (I _do_ use holds), but it should still be possible to delete the other snapshots.
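
To make the "ignore and log" idea concrete, here is a rough Python sketch. It is only an illustration under assumptions, not FreeNAS code and not the code of the replication tool mentioned above: the names `destroy_each` and `run_zfs` and the logger name are made up, and the only real command involved is a plain `zfs destroy <snapshot>`.

```python
import logging
import subprocess

# Hypothetical logger name, chosen for this example only.
log = logging.getLogger("snapshot_cleanup")


def run_zfs(argv):
    """Run a command and return (exit_code, stderr_text)."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


def destroy_each(snapshots, runner=run_zfs):
    """Destroy snapshots one at a time.

    A failure (clone, hold, anything else) is logged and skipped, so it
    never blocks the remaining snapshots. Returns the names that could
    not be destroyed.
    """
    failed = []
    for snap in snapshots:
        # Refuse anything that is not a snapshot name, so a bad entry
        # can never turn into "zfs destroy <dataset>".
        if "@" not in snap:
            log.warning("skipping %r: not a snapshot name", snap)
            failed.append(snap)
            continue
        code, err = runner(["zfs", "destroy", snap])
        if code != 0:
            log.warning("could not destroy %s (exit %d): %s", snap, code, err)
            failed.append(snap)
    return failed
```

The `runner` argument exists only so the loop can be exercised without touching a real pool. The returned list could be retried on the next cleanup run, once the clone or hold is gone.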

Of course this would cause a problem with the multiple-snapshot destruction commands FreeNAS is issuing, because if the deletion of one
of the snapshots fails the whole command is aborted. In that case a workaround would be to check for the presence of clones before adding a snapshot
to the destroy list or, as I do in my replication tool, to delete them one by one. I am not sure which is more efficient: reading some properties (and
eventually checking for holds, if they end up being used on FreeNAS) for every snapshot considered for deletion, or just invoking an independent
"zfs destroy" command for each snapshot and just logging the result.
 
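The other option, checking before adding a snapshot to the destroy list, might look something like the Python sketch below. It is an assumption-laden illustration rather than what FreeNAS does: the helper names (`parse_props`, `is_blocked`, `query_props`, `destroyable`) are invented, and it relies on the standard `clones` and `userrefs` snapshot properties as reported by `zfs get -H`. An empty value and `-` are both treated as "no clones", since I am not certain which one a given ZFS version prints.

```python
import subprocess


def parse_props(output):
    """Parse tab-separated 'property<TAB>value' lines into a dict."""
    props = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        props[name.strip()] = value.strip()
    return props


def is_blocked(props):
    """True if the snapshot has clones or user holds."""
    has_clones = props.get("clones", "") not in ("", "-")
    has_holds = props.get("userrefs", "0") not in ("", "-", "0")
    return has_clones or has_holds


def query_props(snapshot):
    """Ask ZFS for the two properties that can block a destroy."""
    proc = subprocess.run(
        ["zfs", "get", "-H", "-o", "property,value",
         "clones,userrefs", snapshot],
        capture_output=True, text=True, check=True,
    )
    return parse_props(proc.stdout)


def destroyable(snapshots, query=query_props):
    """Keep only the snapshots that have neither clones nor holds."""
    return [s for s in snapshots if not is_blocked(query(s))]
```

This costs one `zfs get` per candidate, so it does not answer the efficiency question by itself, and a destroy can still fail for other reasons between the check and the actual command.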