Replicated Snapshots Not Expiring

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
Hello, I have set up two boxes and am using box A -> box B replication. The replication itself works perfectly. However, the snapshots on the source follow a 2-week rotation, and although the task is set to expire the replicated snapshots "same as source", they are not expiring. I have searched around and have not found anything similar online, so I would like to know if I am missing something obvious. This has happened since 12.0 U1 all the way through to my current 12.0 U4.

If anyone can point me in the right direction here I would appreciate it.

Thanks,
/Derek
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Start by looking in /var/log/zettarepl.log on the box that you run the replication task from.

That may already give some clues as to what's going on.
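
For anyone following along, here is a minimal sketch of one way to narrow that log down to the retention step. It assumes the retention lines carry a `[retention]` tag, as the excerpts posted later in this thread do; the function name `retention_lines` is made up for illustration and is not part of zettarepl.

```python
from pathlib import Path

# Log location as given in the post above.
LOG_PATH = Path("/var/log/zettarepl.log")


def retention_lines(log_text: str) -> list[str]:
    """Return only the lines tagged with [retention] (hypothetical helper)."""
    return [line for line in log_text.splitlines() if "[retention]" in line]


if __name__ == "__main__":
    # Only read the file if it exists, so the sketch is harmless elsewhere.
    if LOG_PATH.exists():
        for line in retention_lines(LOG_PATH.read_text(errors="replace")):
            print(line)
```

Run it on the box that owns the replication task. Lines from other loggers, such as the SSH connection messages, are filtered out because they do not carry the exact `[retention]` tag.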
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
Thanks sretalla. I had a look and I can clearly see that the snapshots are being destroyed on the source daily, but there is nothing in there about destroying snapshots on the destination. I am not sure whether I should expect to see that in this log, however.

[retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset xxxxx
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
Actually, looking at it further, I see a note about local snapshots followed by the line above. Does this mean it is executing the destination removal?

[retention] [zettarepl.zettarepl] Retention destroying local snapshots: xxx
[retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset xxxxx
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
So I have confirmed that the above repeats on each replication. The only thing that looks off is that the dataset referenced in the second destroy command (if it is remote) does not show the full path. It shows only the folder it is in; there is a volume and a dataset before it in the path that are not referenced.

Any further ideas on how to see whether this is attempting to delete and failing, maybe on the destination side?

Thanks,
/Derek
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
Here is some more detail from the logs on the SOURCE, with dataset names and IP changed. It clearly appears to be attempting to delete out-of-date snapshots, but they are not getting deleted. I have the remote dataset set to READONLY, but I don't think that would affect the snapshots.

[2021/06/14 00:02:29] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:29] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:29] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:30] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:30] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:30] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:31] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:31] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:32] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:32] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:33] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:33] INFO [retention] [zettarepl.snapshot.destroy] On <Shell(<LocalTransport()>)> for dataset ‘XX’ destroying snapshots {'auto-2021-05-31_00-00'}
[2021/06/14 00:02:33] INFO [Thread-58] [zettarepl.paramiko.retention] Connected (version 2.0, client OpenSSH_8.4-hpn14v15)
[2021/06/14 00:02:33] INFO [Thread-58] [zettarepl.paramiko.retention] Authentication (publickey) successful!
[2021/06/14 00:02:33] INFO [retention] [zettarepl.zettarepl] Retention on <SSH Transport(root@1.1.1.1)> destroying snapshots: []

In the above log, 1.1.1.1 (which I substituted for the real address) is the IP of the destination box.

/Derek
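
The telling line in that excerpt is the last one, where the list after "destroying snapshots:" for the SSH transport is empty. A rough sketch for spotting that case automatically follows. The regular expression is modelled only on the single line quoted above, so the format of a non-empty list is an assumption, and `parse_retention_line` is a hypothetical helper, not a zettarepl function.

```python
import re

# Pattern based solely on the one remote retention line quoted in this thread.
REMOTE_RETENTION = re.compile(
    r"Retention on <(?P<transport>.+?)> destroying snapshots: \[(?P<body>.*)\]\s*$"
)


def parse_retention_line(line: str):
    """Return (transport, list_is_empty) for a 'Retention on <...>' line.

    Returns None for any other line. How zettarepl formats a non-empty
    list is assumed here, so only emptiness is checked.
    """
    match = REMOTE_RETENTION.search(line)
    if match is None:
        return None
    return match.group("transport"), match.group("body").strip() == ""


if __name__ == "__main__":
    line = (
        "[2021/06/14 00:02:33] INFO [retention] [zettarepl.zettarepl] "
        "Retention on <SSH Transport(root@1.1.1.1)> destroying snapshots: []"
    )
    print(parse_retention_line(line))
```

An empty list on every run, while the destination still holds snapshots older than the source rotation, matches the symptom described in this thread.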
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
Just looking to see if anyone has any other ideas here.

I tried syncing a single dataset and it worked fine, with snapshots expiring on both systems, but when I sync multiple datasets it does not work. The "destroying snapshots" list is always empty and the snapshots are not deleted on the destination.

I have tried this with a single root dataset and with nested datasets, without any luck. I can't imagine this is a flaw; I am assuming it is a configuration issue.

I have also destroyed and rebuilt the destination replicas, with no difference after a full sync.

Any advice at this point would be appreciated.

/Derek
 

djs007

Cadet
Joined
Jun 10, 2021
Messages
7
FWIW, this appears to be resolved now, in case anyone else has had this issue. It appears that adding the "recursive" option for snapshots in the replication task fixes it. I am not sure why, as none of the replicated datasets have any embedded snapshots or datasets, and the single dataset I tested that worked did not have this option enabled.

/Derek
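
To check that a change like this really took effect, one option is to compare snapshot listings from both boxes. The sketch below is not how zettarepl computes retention; it is only a check, assuming the text comes from `zfs list -H -t snapshot -o name -r <dataset>` run on each side. The dataset names `tank/data` and `backup/data` and both function names are hypothetical.

```python
def relative_snapshots(zfs_list_output: str, root: str) -> set[tuple[str, str]]:
    """Turn 'root/child@snap' lines into (child path relative to root, snap)."""
    result = set()
    for line in zfs_list_output.splitlines():
        line = line.strip()
        if "@" not in line:
            continue
        dataset, snapshot = line.split("@", 1)
        if dataset == root:
            relative = ""
        elif dataset.startswith(root + "/"):
            relative = dataset[len(root) + 1:]
        else:
            continue  # not under the dataset being compared
        result.add((relative, snapshot))
    return result


def stale_on_destination(src_listing, src_root, dst_listing, dst_root):
    """Snapshots still on the destination that the source no longer has."""
    src = relative_snapshots(src_listing, src_root)
    dst = relative_snapshots(dst_listing, dst_root)
    return sorted(dst - src)


if __name__ == "__main__":
    # Made-up listings for illustration only.
    src = "tank/data@auto-2021-06-07_00-00\n"
    dst = "backup/data@auto-2021-05-31_00-00\nbackup/data@auto-2021-06-07_00-00\n"
    print(stale_on_destination(src, "tank/data", dst, "backup/data"))
```

With a "same as source" policy working as intended, the result should be empty, or shrink to empty, after the next replication run.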
 