[POSSIBLE BUG] ZFS send/recv cannot send again if stopped

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
The Issue
I've noticed that ZFS send/recv can no longer send data after a failure during the transfer. This happens every time either side loses its connection or I restart the NAS.

Here's an example of what it looks like when it's gonna fail to resend the data:

1698254553016.png


Notice how there's no "[total 56 TiB of 60 TiB]" text. It sits on this forever, nothing happens, and then it errors out. And it will always be the same snapshot.

To fix it, I sometimes delete the snapshot on the sending server. Other times, I've seen the `%recv` snapshot on the other end stay busy forever and refuse to be removed, so I restart that NAS, and it might start transferring after that.
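For anyone hitting the same stuck `%recv` state, here is a minimal sketch of how the interrupted receive could be inspected from a shell instead of rebooting. It assumes the replication was started as a resumable receive (`zfs receive -s`), which may or may not be what the UI task does. The `receive_resume_token` property, `zfs send -t`, and `zfs receive -A` are standard OpenZFS; the helper names and the `tank/backup` dataset in the usage note are hypothetical.

```python
import subprocess

def resume_token_cmd(dataset):
    # Command that prints the resume token of an interrupted receive, or "-" if there is none.
    return ["zfs", "get", "-H", "-o", "value", "receive_resume_token", dataset]

def parse_resume_token(output):
    # `zfs get` prints "-" when the property is not set.
    value = output.strip()
    return None if value in ("", "-") else value

def resume_send_cmd(token):
    # Run on the SENDING side: continues the stream from where the receive stopped.
    return ["zfs", "send", "-t", token]

def abort_receive_cmd(dataset):
    # Run on the RECEIVING side: discards the saved partial receive state.
    return ["zfs", "receive", "-A", dataset]

def get_resume_token(dataset):
    # Hypothetical helper: actually queries ZFS, so it needs the zfs CLI and sufficient privileges.
    result = subprocess.run(resume_token_cmd(dataset),
                            capture_output=True, text=True, check=True)
    return parse_resume_token(result.stdout)
```

On the receiving NAS, `get_resume_token("tank/backup")` would return a token if a partial receive is saved. That token can be fed to `zfs send -t` on the sender to continue, or the partial state can be thrown away with `zfs receive -A` so the next run starts clean. Whether this clears a `%recv` that reports busy is not something this sketch guarantees.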

This happened today. I restarted both ends, and now it works:

1698254784849.png


My question is: how can I avoid this pitfall in the future? I don't want it to keep requiring hard restarts each time this backup fails partway.

My Configuration

1698254911850.png

1698255005209.png
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
Should I be reporting this bug to iX Systems or is it something I should keep to the forum?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
If you believe it to be a bug, use the bug reporting system.

I'd try removing the replication and setting it up again from scratch first.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
These sorts of problems are exactly why I started using zfs autobackup (from GitHub) instead of the UI. It handles interruptions so much more easily, has no odd issues with encryption and roots, is far more flexible, etc. When it takes 3 months to start replication over (which happened at least once a month using the UI, i.e., I could never get the whole thing transferred after I lucked out once), it just isn't worth it. Those of you with stellar internet have probably not noticed!
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
I do have stellar Internet, 5Gb on one side and 1Gb on the other, but transferring 7TiB to 20TiB is the issue: no interruptions can occur for the whole duration.
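To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes "1Gb" means 1 gigabit per second, that the slower side is the bottleneck, and that the link runs at full line rate with no protocol overhead, so real transfers would take longer. The function name is made up for illustration.

```python
TIB = 2**40  # bytes in one tebibyte

def ideal_transfer_hours(size_tib, link_gbit_per_s):
    # Best-case time: total bits divided by link speed, converted to hours.
    bits = size_tib * TIB * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600

for size in (7, 20):
    print(f"{size} TiB at 1 Gb/s: about {ideal_transfer_hours(size, 1):.1f} hours")
# prints:
# 7 TiB at 1 Gb/s: about 17.1 hours
# 20 TiB at 1 Gb/s: about 48.9 hours
```

So even in the ideal case, the connection and both NAS boxes have to stay up for somewhere between most of a day and two full days without a single hiccup.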
 