SOLVED Replication failed while sending incremental snapshot

IQless

Contributor
Joined
Feb 13, 2017
Messages
142
Hi,
I'm having some problems with my replication job.
I keep receiving email notifications that state:
"Hello,
The replication failed for the local ZFS Tank1/Backups while attempting to
apply incremental send of snapshot auto-20180730.2050-2w -> auto-20180731.0850-2w to 10.13.37.10"

This is a replication from the FreeNAS VM to the Backup System.


Hardware:
Main System (VMWare ESXi)
Supermicro X8DTL-iF
2x Intel Xeon E5645 @2.4GHz
6x16GB (96GB) Samsung Registered ECC 1.35V
Fractal Design XL R2
Fractal Design R2 650W Gold
LSI 9211-i8 IT-mode (passthrough to FreeNAS VM)
Boot: 1x 120GB Kingston SSDNow V300 120GB

FreeNAS VM:
FreeNAS-11.1-U5
CPU: 8vCPU
RAM: 32GB
RaidZ2: (4xWD Red 4TB + 2x Seagate Ironwolf 4TB)

Backup System:
FreeNAS-11.1-U5
MB: ASUS P7P55D-E, Socket-1156
CPU: Intel® Core™ i7-860 Quad 2.8GHz
RAM: 16GB (4x Kingston ValueR. DDR3 1333MHz 4GB)
NIC: Dual Intel® 82576 Gigabit Ethernet
Case: Cooler Master CM 690 II Advanced Black
PSU: Silver Power SP-SS850 850W PSU
HDD Dock: ICY BOX IB-555SSK 5Bay Backplane
Boot: Hitachi Travelstar 7K500 500GB 2,5" Laptop HDD
Pool: RaidZ1 (4x 2TB Seagate Barracuda 7200.14, 1x 3TB Seagate Barracuda 3.5)


There seem to be some snapshots missing on the backup system:
Code:
root@redqueen:~ # zfs list -t snapshot | grep Backups
Tank1/Backups@auto-20180722.2050-2w													45.7G	  -   752G  -
Tank1/Backups@auto-20180723.0850-2w													14.5M	  -   753G  -
Tank1/Backups@auto-20180723.2050-2w													14.5M	  -   753G  -
Tank1/Backups@auto-20180724.0850-2w													13.4M	  -  1004G  -
Tank1/Backups@auto-20180724.2050-2w													14.9M	  -  1004G  -
Tank1/Backups@auto-20180725.0850-2w													13.5M	  -   972G  -
Tank1/Backups@auto-20180725.2050-2w													14.5M	  -   972G  -
Tank1/Backups@auto-20180726.0850-2w													14.5M	  -   941G  -
Tank1/Backups@auto-20180726.2050-2w													13.1M	  -   941G  -
Tank1/Backups@auto-20180727.0850-2w													 400K	  -   932G  -
Tank1/Backups@auto-20180727.2050-2w													 416K	  -   932G  -
Tank1/Backups@auto-20180728.0850-2w													 607K	  -   670G  -
Tank1/Backups@auto-20180728.2050-2w													12.7M	  -   679G  -
Tank1/Backups@auto-20180729.0850-2w														0	  -   690G  -
Tank1/Backups@auto-20180729.2050-2w														0	  -   690G  -
Tank1/Backups@auto-20180730.0850-2w													14.8M	  -   713G  -
Tank1/Backups@auto-20180730.2050-2w													12.5M	  -   713G  -
Tank1/Backups@auto-20180731.0850-2w														0	  -   978G  -
Tank1/Backups@auto-20180731.2050-2w														0	  -   978G  -
Tank1/Backups@auto-20180801.0850-2w														0	  -   952G  -
Tank1/Backups@auto-20180801.2050-2w														0	  -   952G  -
Tank1/Backups@auto-20180802.0850-2w														0	  -   927G  -
Tank1/Backups@auto-20180802.2050-2w														0	  -   927G  -
Tank1/Backups@auto-20180803.0850-2w													 480K	  -   902G  -
Tank1/Backups@auto-20180803.2050-2w													10.9M	  -   902G  -
Tank1/Backups@auto-20180804.0850-2w													9.35M	  -   624G  -
Tank1/Backups@auto-20180804.2050-2w													14.2M	  -   632G  -
Tank1/Backups@auto-20180805.0850-2w														0	  -   634G  -

Code:
root@greenqueen:~ # zfs list -t snapshot | grep Backups
tank/redqueenRep/Backups@auto-20180715.2050-2w													34.8G	  -   761G  -
tank/redqueenRep/Backups@auto-20180716.0850-2w													9.99M	  -   778G  -
tank/redqueenRep/Backups@auto-20180716.2050-2w													9.57M	  -   778G  -
tank/redqueenRep/Backups@auto-20180717.0850-2w													9.46M	  -  1.01T  -
tank/redqueenRep/Backups@auto-20180717.2050-2w													21.0M	  -  1.01T  -
tank/redqueenRep/Backups@auto-20180718.0850-2w													8.64M	  -  1.00T  -
tank/redqueenRep/Backups@auto-20180718.2050-2w													7.88M	  -  1.00T  -
tank/redqueenRep/Backups@auto-20180719.0850-2w													12.9M	  -  1023G  -
tank/redqueenRep/Backups@auto-20180719.2050-2w													6.90M	  -  1023G  -
tank/redqueenRep/Backups@auto-20180720.0850-2w													 422K	  -  1017G  -
tank/redqueenRep/Backups@auto-20180720.2050-2w													 422K	  -  1017G  -
tank/redqueenRep/Backups@auto-20180721.0850-2w													10.7M	  -   751G  -
tank/redqueenRep/Backups@auto-20180721.2050-2w													12.3M	  -   751G  -
tank/redqueenRep/Backups@auto-20180722.0850-2w													11.3M	  -   751G  -
tank/redqueenRep/Backups@auto-20180722.2050-2w													11.5M	  -   751G  -
tank/redqueenRep/Backups@auto-20180723.0850-2w													11.8M	  -   751G  -
tank/redqueenRep/Backups@auto-20180723.2050-2w													11.9M	  -   751G  -
tank/redqueenRep/Backups@auto-20180724.0850-2w													10.9M	  -  1002G  -
tank/redqueenRep/Backups@auto-20180724.2050-2w													12.1M	  -  1003G  -
tank/redqueenRep/Backups@auto-20180725.0850-2w													11.0M	  -   970G  -
tank/redqueenRep/Backups@auto-20180725.2050-2w													11.7M	  -   970G  -
tank/redqueenRep/Backups@auto-20180726.0850-2w													11.8M	  -   939G  -
tank/redqueenRep/Backups@auto-20180726.2050-2w													10.7M	  -   939G  -
tank/redqueenRep/Backups@auto-20180727.0850-2w													 358K	  -   931G  -
tank/redqueenRep/Backups@auto-20180727.2050-2w													 371K	  -   931G  -
tank/redqueenRep/Backups@auto-20180728.0850-2w													 562K	  -   669G  -
tank/redqueenRep/Backups@auto-20180728.2050-2w													10.4M	  -   678G  -
tank/redqueenRep/Backups@auto-20180729.0850-2w														0	  -   688G  -
tank/redqueenRep/Backups@auto-20180729.2050-2w														0	  -   688G  -
tank/redqueenRep/Backups@auto-20180730.0850-2w													12.1M	  -   712G  -
tank/redqueenRep/Backups@auto-20180730.2050-2w														0	  -   712G  -
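
One way to see exactly where the two sides diverge is to compare just the snapshot names from both machines. The pipeline below is only a sketch using the dataset paths from the listings above; the file names under /tmp are arbitrary:

Code:
# On the primary (redqueen) and the backup (greenqueen), dump only the
# snapshot names for the dataset in question:
zfs list -H -t snapshot -o name -r Tank1/Backups > /tmp/redqueen-snaps.txt
zfs list -H -t snapshot -o name -r tank/redqueenRep/Backups > /tmp/greenqueen-snaps.txt

# Copy one file over to the other machine, strip the dataset prefix and compare.
# This prints the snapshots that exist only on the primary, i.e. the ones the
# backup is missing:
sed 's/.*@/@/' /tmp/redqueen-snaps.txt   | sort > /tmp/primary-names.txt
sed 's/.*@/@/' /tmp/greenqueen-snaps.txt | sort > /tmp/backup-names.txt
comm -23 /tmp/primary-names.txt /tmp/backup-names.txt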


I'm not entirely sure how to fix this problem. Considering the total size of the pool is 6TiB, I hope I won't need to do a full replication again..
Any suggestions?
 

M H

Explorer
Joined
Sep 16, 2013
Messages
98
Clear out snapshots on the backup until you are back at a snapshot that is identical on both the primary and backup machines, then replicate from there.

On the backup machine, run zfs destroy -rv POOL/dataset@firstsnap%lastsnap, where firstsnap is the first snapshot after the last one that is still identical on both machines and lastsnap is the newest snapshot on the backup. Add the "n" option for a dry run to make sure the right snapshots are being removed (zfs destroy -rvn). That should save you from having to resend the full 6TB. Both sides have to be identical up to a common snapshot, or the replication will keep failing.
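
For the pools in this thread, a dry run on the backup machine (greenqueen, dataset path taken from the listing above) might look like the sketch below; firstsnap and lastsnap are placeholders to fill in from your own zfs list output:

Code:
# On the backup machine. firstsnap = first snapshot that no longer matches the
# primary, lastsnap = newest snapshot on the backup (fill in your own names).
# -n = dry run, -v = print each snapshot, -r = recurse into child datasets.
zfs destroy -rvn tank/redqueenRep/Backups@firstsnap%lastsnap

# If the dry run lists only the snapshots you expect to lose, run it for real,
# then let the next scheduled replication pick up from the common snapshot.
zfs destroy -rv tank/redqueenRep/Backups@firstsnap%lastsnap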
 

IQless

Contributor
Joined
Feb 13, 2017
Messages
142
Clear out snapshots on the backup until you are back at a snapshot that is identical on both the primary and backup machines, then replicate from there.

On the backup machine, run zfs destroy -rv POOL/dataset@firstsnap%lastsnap, where firstsnap is the first snapshot after the last one that is still identical on both machines and lastsnap is the newest snapshot on the backup. Add the "n" option for a dry run to make sure the right snapshots are being removed (zfs destroy -rvn). That should save you from having to resend the full 6TB. Both sides have to be identical up to a common snapshot, or the replication will keep failing.

Thanks, I will try this after work. Let's hope it works.

Otherwise, I will just have to resend the full pool :S
 

IQless

Contributor
Joined
Feb 13, 2017
Messages
142
Update: It solved the problem, thanks!
 

appliance

Explorer
Joined
Nov 6, 2019
Messages
96
Clear out snapshots on the backup until you are back at a snapshot that is identical on both the primary and backup machines, then replicate from there.

On the backup machine, run zfs destroy -rv POOL/dataset@firstsnap%lastsnap, where firstsnap is the first snapshot after the last one that is still identical on both machines and lastsnap is the newest snapshot on the backup. Add the "n" option for a dry run to make sure the right snapshots are being removed (zfs destroy -rvn). That should save you from having to resend the full 6TB. Both sides have to be identical up to a common snapshot, or the replication will keep failing.
Why do both sides need to be identical? The source system can have a shorter retention than the target system. I have more snapshots on the target system than on the source, just like in the listings above, for all datasets, and that works fine: I set the snapshot lifetime to 1 week on the source system and 1 month on the target system, for example. However, I occasionally get this error when there has been no activity on the source system and therefore no new snapshots exist. I'd like to get rid of this bug, as it stops the other datasets from replicating.
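
For context, each incremental step of the replication is roughly of the form below (snapshot names taken from the error message earlier in the thread; the real FreeNAS job adds more options). The snapshot named after -i has to exist, unmodified, on the receiving side, which is why replication stops as soon as the two systems no longer share a common snapshot:

Code:
# Sketch of one incremental replication step, assuming SSH transport:
zfs send -i Tank1/Backups@auto-20180730.2050-2w \
            Tank1/Backups@auto-20180731.0850-2w | \
    ssh 10.13.37.10 zfs receive tank/redqueenRep/Backups
# The base snapshot given to -i (auto-20180730.2050-2w) must exist on the
# target. Different retention periods are fine as long as that common
# snapshot survives on both sides.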
 