Error migrating data to new pool

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
Hi,

I got an unexpected error when migrating to my new zpool:

Code:
root@nas[~]# zfs snapshot -r Tank@Transfer-20230225_1005
root@nas[~]# zfs send -R Tank@Transfer-20230225_1005 | pv | zfs recv -F Tank2
1.84TiB 3:58:11 [ 135MiB/s] [  <=>                                             ]
cannot mount 'Tank2': mountpoint or dataset is busy


Does anyone know what this means? It seems all data and all snapshots are now present on Tank2 - is there any way I can be sure that everything has been migrated properly?

Thank you in advance!
 
Joined
Oct 22, 2019
Messages
3,641
cannot mount 'Tank2': mountpoint or dataset is busy
When did you see this message? Soon into the replication, or only after it completed?

Next time, for such a migration, you can invoke the "-u" flag on the "zfs recv" side, so that it will not attempt to mount the datasets. Then you can proceed to export and re-import the "new" pool.
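
For example, something along these lines (a rough sketch reusing your original command, so adjust names and options to your setup):

Code:
zfs send -R Tank@Transfer-20230225_1005 | pv | zfs recv -u -F Tank2   # -u = do not mount received datasets
zpool export Tank2
zpool import Tank2
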


is there any way I can be sure that everything has been migrated properly?
You can compare the snapshot GUIDs between the old pool and new pool:
Code:
zfs list -r -t snap -o name,guid Tank | grep Transfer-20230225_1005

zfs list -r -t snap -o name,guid Tank2 | grep Transfer-20230225_1005


Snapshots cannot be "partial". If the GUIDs match, then you should be good to go. You can even compare their sizes. (A snapshot either exists or it doesn't. You can't have two identical snapshots that "differ" by 1% or something.)
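
If there are a lot of snapshots, a quick way to compare the two lists automatically could look like this (untested sketch; the temp file names are just examples):

Code:
zfs list -r -t snap -o guid,name Tank  | grep Transfer-20230225_1005 | awk '{print $1}' | sort > /tmp/guids-old
zfs list -r -t snap -o guid,name Tank2 | grep Transfer-20230225_1005 | awk '{print $1}' | sort > /tmp/guids-new
diff /tmp/guids-old /tmp/guids-new   # no output means every GUID is present on both pools
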
 

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
Thank you! Sorry for late reply.

The GUIDs do match, so it shouldn't be a problem then.

However, having just stumbled upon a couple of articles about the ZFS send/recv hole_birth issue, I decided to go "full paranoia" anyway and do a complete compare of all my datasets using rsync (which confirmed that all was indeed OK).
 

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
For anyone else in "full paranoia" mode wanting to do a full file-system compare using rsync, this is what I used:

Code:
rsync -n -acvi --delete /mnt/{dataset A}/.zfs/snapshot/{snapshot}/ /mnt/{dataset B}/.zfs/snapshot/{snapshot}


(Apologies for stating the obvious for those of you who already speak fluent rsync. Gotta admit I had to read the man page for a few moments before coming up with the above. Also, feel free to point out anything I might have missed or overlooked in the above command.)
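
To run the same compare over several datasets in one go, a small loop along these lines should also work (the dataset names here are just placeholders, and the paths assume the default /mnt/<pool>/<dataset> mountpoints):

Code:
SNAP=Transfer-20230225_1005
for ds in Data1 Data2 Data3; do   # placeholder dataset names
    rsync -n -acvi --delete "/mnt/Tank/${ds}/.zfs/snapshot/${SNAP}/" "/mnt/Tank2/${ds}/.zfs/snapshot/${SNAP}/"
done
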
 

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
I'm still slightly worried something might have gone wrong transferring the snapshots.

With the rsync compare I feel fairly confident all my data not in snapshots has been moved properly.

However, when I look at the dataset sizes, I see some peculiar differences:
  • Why is used and usedbysnapshots so much larger in the old dataset (Tank/Data1) than in the new (Tank2/Data1)?
    • Compression ratio seems to be the same
    • I have not changed compression during the life of the dataset
    • Tank2 contains 3G more snapshot data than Tank (see below)
    • I know the used number can differ somewhat after send/recv
    • However, surprised (and slightly worried) to see a 20% decrease
    • Anyone have an idea why this is?
  • Logicalused and logicalreferenced seem to be the same
    • Note: a number of backup snapshots corresponding to c. 3G (based on zfs list -t snapshot size information) have been dropped in Tank but not in Tank2, which could explain why logicalused on Tank2 is 2G bigger than on Tank
    • Are these the numbers I should compare to feel confident everything is copied properly?

Old dataset

Code:
zfs get recordsize,usedbysnapshots,usedbydataset,usedbychildren,usedbyrefreservation,logicalused,logicalreferenced,used,compressratio,compression  Tank/Data1
NAME          PROPERTY              VALUE          SOURCE
Tank/Data1  recordsize            128K           default
Tank/Data1  usedbysnapshots       57.9G          -
Tank/Data1  usedbydataset         12.8G          -
Tank/Data1  usedbychildren        0              -
Tank/Data1  usedbyrefreservation  0              -
Tank/Data1  logicalused           46.3G          -
Tank/Data1  logicalreferenced     13.3G          -
Tank/Data1  used                  70.9G          -
Tank/Data1  compressratio         1.02x          -
Tank/Data1  compression           lz4            inherited from Tank


New dataset

Code:
zfs get recordsize,usedbysnapshots,usedbydataset,usedbychildren,usedbyrefreservation,logicalused,logicalreferenced,used,compressratio,compression  Tank2/Data1
NAME           PROPERTY              VALUE          SOURCE
Tank2/Data1  recordsize            128K           default
Tank2/Data1  usedbysnapshots       44.5G          -
Tank2/Data1  usedbydataset         12.7G          -
Tank2/Data1  usedbychildren        0              -
Tank2/Data1  usedbyrefreservation  0              -
Tank2/Data1  logicalused           48.4G          -
Tank2/Data1  logicalreferenced     13.3G          -
Tank2/Data1  used                  57.2G          -
Tank2/Data1  compressratio         1.02x          -
Tank2/Data1  compression           lz4            inherited from Tank2
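
To try to narrow down where the difference comes from, comparing the space used per snapshot on both sides might also help, roughly like this (just a sketch, not run yet):

Code:
zfs list -r -t snapshot -o name,used,refer -s creation Tank/Data1
zfs list -r -t snapshot -o name,used,refer -s creation Tank2/Data1
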
 
Joined
Oct 22, 2019
Messages
3,641
Note: a number of backup snapshots corresponding to c. 3G (based on zfs list -t snapshot size information) have been dropped in Tank but not in Tank2, which could explain why logicalused on Tank2 is 2G bigger than on Tank
That would be why.
 

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
Thank you for the quick reply.

With my limited understanding, c. 3 GB more data on Tank2 would have explained Tank2 being larger than Tank. This is, however, not the case. Rather, Tank2 is 14 GB smaller than Tank (as reported by the "used" property).

I understand that the on-disk representation may change with send/recv, which could explain differences in used disk space for the same data. However, considering that compressratio is the same, I am somewhat surprised by the 20% decrease in used (not taking into account the c. 3 GB more data - with that in mind it is probably closer to a 23% decrease in space used).
 