Copy and sync data, between zpools/volumes ?


arameen

Contributor
Joined
Sep 4, 2014
Messages
145
With my previous NAS server (NAS4Free) I could use rsync to locally sync or copy data between my 2 volumes/zpools.
After reading the FreeNAS user guide and going through its GUI, it seems this is not possible in FreeNAS :confused: ?
1) How do I copy files between volumes with a redundancy check?
2) How can I keep two folders on two different volumes synced?
I googled and searched through the forums but found no answers :(
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
rsync or zfs replication will do what you want. To set up rsync in the GUI you will use the cron settings if you want it to run automatically.
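For a purely local sync, the cron job only needs an rsync command line. A minimal sketch (the paths below are placeholders, point them at your own datasets, and consider a --dry-run pass first):

Code:
rsync -avh --delete /mnt/Pool_A/Dataset_1/ /mnt/Pool_B/Dataset_1_backup/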
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
You can use replication. It takes care of checksum verification during the copy process, so it is 99.99999% reliable, if you have ECC memory of course.
There are many ways to go about replication, but everything starts with a snapshot.
You can read the FreeNAS 9.3 documentation via the link in your FreeNAS GUI, at the top right where the question mark icon is displayed.
You can use automatic replication or you can use the command line.

Let's say you have a Pool_A and a Pool_B on the same machine; Pool_B can of course also be over the network on a different FreeNAS system:

Pool_A/Dataset_1
Pool_A/Dataset_2
Pool_A/Dataset_3/Dataset_3_A
Pool_A/Dataset_3/Dataset_3_B
Pool_A/Dataset_3/Dataset_3_C

Pool_B/Dataset_X

If you only need to replicate Pool_A/Dataset_3 and its child datasets, you can take a recursive manual snapshot of Pool_A/Dataset_3. Let's call it manual-2015-02-09.
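For reference, such a recursive manual snapshot can be taken from the shell like this (using the example names from this post):

Code:
zfs snapshot -r Pool_A/Dataset_3@manual-2015-02-09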

To start replication, enter the following command:

Code:
zfs send -vR Pool_A/Dataset_3@manual-2015-02-09 | zfs receive -vF Pool_B 


The command will replicate the dataset structure under Pool_A/Dataset_3 to Pool_B.
Any existing snapshots taken prior to this manual recursive one will also be replicated.

The advantage of the command line and manual snapshots is that, even if you use automatic snapshots and replication, manual snapshots never expire. This means you can safely sync up to a device that is not always available, such as another server. It is good if you replicate to a hard drive for archiving purposes, something the automatic replication cannot do correctly, as it always looks at the latest snapshot, which may already have expired depending on the "Keep snapshot for" settings.

Once the replication is complete you will have an exact copy of the source.

If you want to update Pool_B on a regular basis, you then need to change the command to include the incremental portion, as follows:
You need to create another manual recursive snapshot as above and give it the current date, or otherwise make its name unique so that it doesn't match the original one. For instance, say 2 weeks have passed and you want to replicate again.
You take a new recursive snapshot of Pool_A/Dataset_3 and name it manual-2015-02-25.

The command becomes:

Code:
zfs send -vRI Pool_A/Dataset_3@manual-2015-02-09 Pool_A/Dataset_3@manual-2015-02-25 | zfs receive -v Pool_B


The zfs send options -R and -I are case sensitive and do not have the same effect as -r and -i.
The uppercase options let you replicate everything (child datasets and all intermediate snapshots) without removing stale snapshots on the destination ("pull" in FreeNAS terminology).

If you do plan on backing up onto a backup drive, then you should set the readonly property on Pool_B. This has to be done once, before the initial replication; otherwise you may get errors during replication updates regarding the state of Pool_B.

Code:
zfs set readonly=on Pool_B


Of course, if Pool_B is always connected, then it is best to use the FreeNAS GUI's automatic replication.

Let me know if you have more specific questions.
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
I am a bit confused about configuring rsync locally. I have gone through the rsync part of the manual. Both of my pools are on the same server.
I set up 2 tasks, a push and a pull, and 1 module. See pics.
http://postimg.org/image/axv8ibwjj/
http://postimg.org/image/ubck9oi6l/

(Attached screenshots: Rsync_Tasks.jpg, Rsync_Module.jpg)


Nothing is happening :confused:


And yes, I use ECC memory and an ECC-compatible CPU, which I hope benefits rsync and not only replication?
Regarding replication, I think I will need to dig into it more later. But for now I have to go with the easier option, because I am in a hurry to copy/sync and am new to FreeNAS.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I think replication is much more reliable, and quicker too.
With rsync, if you want proper synchronization, you need to do a checksum comparison of the source and then the destination before the transfer even starts.
With replication, either the replication succeeds or it fails. When it fails (e.g. loss of connection between networked machines), the failed snapshot just needs to be sent again incrementally. In the end, even with an interrupted replication, whatever snapshots are on the pull side are an exact copy of the push side.
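If you do stick with rsync and want it to compare actual file contents rather than just size and modification time, you can force a full checksum pass. A sketch with placeholder paths (note it makes the run considerably slower):

Code:
rsync -avh --checksum /mnt/Pool_A/data/ /mnt/Pool_B/data/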
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
Ok, you convinced me :)
Looking into replication. As far as I understand, it works at the dataset level. So it is no use if I want to copy a specific folder and not all folders, especially if the destination has limited space. Correct?
But for copying everything, in my case, this is a working solution with a one-time snapshot.
Regarding rsync, doesn't it have a CRC check, and isn't it as safe as replication for moving my data?
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
Ok, you convinced me :)
Looking into replication. As far as I understand, it works at the dataset level. So it is no use if I want to copy a specific folder and not all folders, especially if the destination has limited space. Correct?
Correct. However, and it could make things overcomplicated, you could create a dataset locally and do a copy via SSH ("cp folder to_dataset").
This way you take a snapshot of the dataset containing that folder and replicate it to the destination.
Ideally, you would create an everyday dataset where you do your everyday work, and replicate that one any time you feel like it.
You can create datasets within datasets, similar to a regular folder structure, with the top-level dataset being the share. You can still descend into the different datasets exactly like regular folders. And of course you can replicate only the dataset you really need, or simply take a recursive snapshot and just replicate the snapshot of interest. A rough sketch of this idea is shown below.
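Something along these lines (just a sketch; "Everyday" is a made-up dataset name and the paths assume the pools are mounted under /mnt):

Code:
zfs create Pool_A/Everyday
cp -a /mnt/Pool_A/Dataset_1/some_folder /mnt/Pool_A/Everyday/
zfs snapshot Pool_A/Everyday@manual-2015-02-09
zfs send -v Pool_A/Everyday@manual-2015-02-09 | zfs receive -v Pool_B/Everyday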

But for copying everything, in my case, this is a working solution with a one-time snapshot.
Exactly, and you can do incremental snapshots after that. It's a bit tricky to get all the steps right, but when it works, it works great.

Regarding rsync, doesn't it have a CRC check, and isn't it as safe as replication for moving my data?
Not quite. ZFS will take care of validating the raw data, and rsync will use checksums over the network to ensure the integrity of the packets. That part is fine. However, if your transfer is interrupted, you will not know for certain which files have been copied over and whether those files are themselves intact. I think rsync may create a file of the expected size and populate its contents as it goes along, but I am not sure about that part. At the end of the day, you need to know that all the files you have transmitted are in fact what you expect. In the case of a transfer failure, you will have to have both the transmit and receive sides run a checksum comparison (if you are paranoid, which I am), so that any incomplete file (which could well be of the same size) gets sent over again. This requires that both the transmit and receive sides compute local checksums; you could run a binary comparison instead, but then why not just send the file all over again? Thus, the transmit side will have to check already existing files, which could be a huge amount of data, including files that were rsynced days, weeks or months prior (something snapshots handle very well). Rsync will be required to check every single file on both transmit and receive.
For example, say I am building my DVD media library. I buy 5 DVDs per week that I back up and save to a folder. That is about 9GB per disc, so 45GB in total. Every week I want to rsync my folder to an external drive (the type doesn't matter). I will have to check the 45GB from the first week, then do the same the following week, which now takes an extra 45GB, so 90GB. Repeat this for an entire year and you will have to rsync-check about 2.4TB. Definitely a few hours' worth. If you get rsync errors (i.e. a checksum discrepancy when checking both transmit and receive), you will have to choose which side contains the correct data. Hopefully you will have ZFS on both ends, so technically the only time you will get corruption would be from a failed transfer to the destination. If you are indeed rsyncing to another ZFS system and are running automatic snapshots, then you will have to worry about your pool capacity due to ZFS's copy-on-write nature. If you don't use snapshots this is not so much of an issue; otherwise your pool's free space will shrink quicker than you think.

In conclusion, I would recommend you find a way to work with snapshots and replication. Replication will only send whatever data has been added between the previous snapshot and the latest one. So if we apply the 5-DVDs-a-week scenario to replication, you will only ever send the new 45GB of data to your destination drive each week. Regular scrubs of the source and destination will guarantee your data is not silently corrupted. If replication fails, only the unfinished snapshot is discarded, so whatever snapshots were already transmitted are final. You don't need to second-guess whether a snapshot is complete: if it is present on the destination, it is complete. To check, you can run "zfs list -t snapshot" and verify the snapshots are there, as in the example below.
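For instance, to list the snapshots that made it to the destination (reusing the Pool_B name from the earlier example):

Code:
zfs list -t snapshot -r Pool_B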

Sorry for the novel, but I think the distinction between rsync on one hand and snapshots and replication on the other is important.
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
No, thank you for the comments :) I am new to FreeNAS and love to read and learn as much as possible, so I really appreciate your posts.
And me being new is a little problem: I don't know any commands. I have barely grasped the whole GUI, and I'm not sure where to learn all the commands for whatever is not possible in the GUI, like "cp folder to_dataset".
And you are correct, I am paranoid. That is the only reason I am here and using a ZFS-based server for storage :)

Anyway, it seems replication is superior to rsync.
I only need to run a small test of replication before I can go big with all my data. I need to know that it works, and whether I can find my way around the GUI or will need commands once I want to create my destination copy, or whatever it's called in a replication context :) ?
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
No, thank you for the comments :) I am new to FreeNAS and love to read and learn as much as possible, so I really appreciate your posts.
And me being new is a little problem: I don't know any commands. I have barely grasped the whole GUI, and I'm not sure where to learn all the commands for whatever is not possible in the GUI, like "cp folder to_dataset".
And you are correct, I am paranoid. That is the only reason I am here and using a ZFS-based server for storage :)
Unix-like commands are usually not straightforward.
One main thing to keep in mind is that they are case sensitive. This is very important for options, as the same letter can exist in both lower case and upper case and yet behave differently.
I find the lack of "man", which stands for "manual", a downer in FreeNAS. You need to look up the details online or keep notes.


Anyway, it seems replication is superior to rsync.
I only need to run a small test of replication before I can go big with all my data. I need to know that it works, and whether I can find my way around the GUI or will need commands once I want to create my destination copy, or whatever it's called in a replication context :) ?
Always experiment, but it is not always without risks.
With the zfs send | zfs recv commands, some options can actually destroy the data on the destination/backup drive, so you must be extremely careful.
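One way to take some of the risk out of experimenting (just a sketch, reusing the pool names from earlier in this thread): zfs receive has a dry-run flag, -n, which parses the stream and reports what it would do without writing anything to the destination:

Code:
zfs send -vR Pool_A/Dataset_3@manual-2015-02-09 | zfs receive -vnF Pool_B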

If you have a volume structure already implemented, and one in mind for the backup, I can write the command for you and give you a rundown of what it does.
There is a forum thread where I described the details of the process, but it is better understood when you can correlate it with an identical setup.
If you have a decent PC, you can install FreeNAS as a virtual machine under VirtualBox and emulate the creation of a virtual drive. You can then create a few datasets, create folders and copy some files into them, and create a second dataset as the backup, still within VirtualBox.
It is definitely the safest way, but not necessarily the easiest.

To get information on ZFS, I usually search online, and I find the Oracle website very informative.
Also, you want to get accustomed to SSH with PuTTY; that way you don't need to use the command window within the FreeNAS web GUI, which is not practical at all.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
"I find the lack of "man", which stand for "manual" a downer in FreeNAS." Aww man, I think alike. Why they have removed the man? it's such a great feature, just why?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
"I find the lack of "man", which stand for "manual" a downer in FreeNAS." Aww man, I think alike. Why they have removed the man? it's such a great feature, just why?
Space limitations prior to FreeNAS 9.3, I hear.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That's exactly what it was. Since end-users shouldn't, as a course of business, be using the CLI, there was no reason to have the manpages.

I did talk to the CTO of iXsystems and he said they have no intention of adding the manpages, so don't bother asking. ;)
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
I tried out replication of a snapshot. It worked fine with your command.
So I decided to go with a one-time snapshot and replication to move my 5 TB from the first pool to the second one on the same machine.
But how can I see the progress of the replication???
As soon as I closed the shell window where I typed the replication command, the progress info disappeared and there was no further info on status. I tried to find some kind of status command to paste into a shell window to see how far the replication has gotten, or whether it finished without issues.

In the future I will use periodic snapshots for some of my documents and skip my manual periodic backup :)
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I tried out replication of a snapshot. It worked fine with your command.
So I decided to go with a one-time snapshot and replication to move my 5 TB from the first pool to the second one on the same machine.
But how can I see the progress of the replication???
You need to have the -v option on the send and/or receive side of the command.
It will indicate how much data has been sent.
If you look at the first few lines of output, they summarize how much data is supposed to be sent in total.

As soon as I closed the shell window where I typed the replication command, the progress info disappeared and there was no further info on status. I tried to find some kind of status command to paste into a shell window to see how far the replication has gotten, or whether it finished without issues.
You need to run the command inside tmux, because a command run within the shell is terminated when you close it.

Code:
tmux

Then you will be able to run the zfs send and zfs receive command, same as you did in the shell.
It is best to use SSH through PuTTY for this kind of work, while still running the command under tmux.
Only then will you be able to exit the shell, open a new SSH session via PuTTY, and reattach with the tmux attach command:

Code:
tmux attach


You can check that the transfer is under way by looking at the FreeNAS GUI reports for disks and partitions; they will show activity.
You can also run the gstat command to look at real-time drive activity (throughput).
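For example (on FreeBSD, the -p flag limits the display to physical disks; adjust as you see fit):

Code:
gstat -p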

In the future I will use periodic snapshots for some of my documents and skip my manual periodic backup :)
Periodic snapshots will work, but I don't know whether automatic replication will pick up from the last snapshot already on the backup drive. It may complain about the existing dataset on the backup.
 

MaIakai

Dabbler
Joined
Jan 24, 2013
Messages
25
So I'm planning on redoing my main pool (Home) with new 6TB drives.

I've created a new RAID-Z volume of 8 x 2TB drives and named it Backups, using a JBOD (external SAS connection). The main pool is 10.7TB not accounting for compression; the Backup pool is empty at 12.7TB.

Looking around, I saw that ZFS replication is a better option than rsync for copying data from one pool to another.

I created a manual recursive snapshot of the Home volume and named it Backup-DATE.
Then I used this command:

Code:
zfs send -vR Home/Home@Backup-2015-10-19 | zfs receive -vF Backup/Backup


And it finished... Only now my pools are different sizes. It seems like I nested the backup volume a bit too much. There is a compression ratio difference of 0.01x, which is odd. But I'm seeing a 1TB difference between them.

(Attached screenshot: upload_2015-10-20_21-25-0.png)


Anything else I can do to verify things are correct before I blow away the Home volume? I did notice that the Backup volume holds more than twice as many files as Home, 780k files vs 350k, which again doesn't make sense seeing as Backup is smaller. I'd rather not do anything as intensive as md5ing every file/folder.

I did notice that during the zfs send/receive I lost free space in the Home pool. Why? Could the 1TB difference come from that? (I had almost 5TB free in that pool prior to doing all of this.)

Most of my critical data is already backed up offsite (1.2TB)
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
If you have identical snapshots on both the send and receive sides, then you have an exact copy.
Do the following:

Code:
zfs list -t snapshot -o name -r Home >  ~/snap-Home.txt
zfs list -t snapshot -o name -r Backup >  ~/snap-Backup.txt

Edit:
Open snap-Home.txt and strip the pool name by replacing "Home" with "" inside the file.
Repeat the process with snap-Backup.txt, replacing "Backup" with "" inside the file.

Do a file comparison; if the number of snapshots is the same and the names of the snapshots are also the same, then you have made a proper replication and everything you had on Home is now on Backup. One way to do this from the shell is sketched below.
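A quick way to do the stripping and the comparison from the shell (just a sketch, assuming the two files created above):

Code:
sed 's/^Home//' ~/snap-Home.txt > ~/snap-Home-stripped.txt
sed 's/^Backup//' ~/snap-Backup.txt > ~/snap-Backup-stripped.txt
diff ~/snap-Home-stripped.txt ~/snap-Backup-stripped.txt   # no output means the snapshot lists match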
Compression will be the culprit for the size difference, but it is a good thing in this case.
You must understand that replication is not a block copy the way HDD sector imaging is. Instead, the data is rewritten to fit the new dataset, and you may well have chosen a different compression setting than on the original dataset.
As long as the snapshots are identical, everything is fine.
 

MaIakai

Dabbler
Joined
Jan 24, 2013
Messages
25
So.... I redid my backup volume and ran it again.

For some reason, upon rebooting, Backup showed a failed drive that didn't really exist. This forced that volume offline and I couldn't do anything with it. Looking at it, it said I had 9 drives instead of 8; no clue where it pulled that extra drive from.

That volume was originally created on a Nas4free system and imported in.

So: I created a new Backup volume within FreeNAS, same settings as the Home volume, and ran zfs send | receive again.

Everything copied over beautifully. 315,033 files vs 315,032 files, all other tests come back valid.

Detached Home, created a new volume named Home with the new 6TB drives, did a zfs send | receive from Backup to Home, then detached Backup.

The file count is the same. The only thing I notice that's different is that the size on disk is a bit more, but meh, I'm happy for now. I checked ashift; both the old and new pools are 12, so it's not a 4K alignment issue.
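In case anyone wants to double-check ashift themselves, one common way is to grep it out of zdb's pool configuration dump (just a sketch; on FreeNAS you may need to point zdb at the system's zpool.cache for it to find the pool):

Code:
zdb -C Home | grep ashift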
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
File count alone is not an indication of a proper backup. Snapshots hold deleted files, and those will not show up in the file list; however, when you restore or clone a particular snapshot, the files that were deleted after that snapshot become recoverable again.
This is why you need to compare the dataset snapshots and their dates.
 