Does replication save snapshots or directory tree?

Status
Not open for further replies.

Doogie

Dabbler
Joined
Mar 9, 2014
Messages
24
I have three 80 TB FreeNAS 9.2.4 servers named "Server", "Backup1", and "Backup2". Server contains a single dataset, "Volume". Server is in use 24x7, and Backup1 and Backup2 are kept mainly offline and serve as rsync repositories for Server. Periodically, I run rsync to synchronize Server->Backup1 and Server->Backup2 over 1G ethernet. I require Backup1/2 to have an identical directory tree to Server, such that if Server goes up in smoke, I could put either Backup1/2 online and have access to all files immediately. Rsync does this, and this setup has served my needs well for many years.

I am in the process of upgrading to 9.10.2. I have read that replication is faster, and the preferred solution to the above scenario. I would like to try using replication. My question is this: if I set up Server (=alpha, following online nomenclature) to share Volume with Backup1/2, is the directory tree replicated under Backup1/2 (=beta) Volume, or are blobs of data representing snapshots stored on Backup1/2?

My confusion stems from my lack of knowledge of snapshots. Although these are a basic aspect of ZFS, I have not found a need to employ them to date and have no experience in their use. Are snapshots binary blobs like diff files for volumes which are solely used to reconstruct data, or are snapshots something else? I do not wish to set up replication if Backup1/2 have their volumes full of files like "snapshot20161229-2019.bin", and a failure of Server requires me to reconstruct data from these files. I need duplicate directory trees.

Any help from an experienced user would be greatly appreciated. Thank you.

Jeff Arnholt
 
Joined
Feb 2, 2016
Messages
574
Snapshots and replication are so much better than rsync.

Yes, the file structure on the target nodes will look identical to the file structure on the source nodes (assuming you name everything the same). No extraction, reconstruction or post-processing is required if you want the most recent version of the files.

If you want an earlier version, it is still quick; cloning even a large snapshot into live files takes under a minute.

With snapshot replication, you may even want to keep the backup nodes up and running. Automated replication at time intervals you choose and with the number of versions you want is nearly effortless.

The first time you replicate a pool, it'll take a long time because it has to send the entire data set. After that, it only needs to send the changes. Unlike rsync, which has to do a lot of scanning to see which files have changed, FreeNAS already knows, so the snapshot and send are wicked fast.
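The incremental flow described above looks roughly like this; the dataset name Volume/Dataset, the snapshot names, and the host backup1 are illustrative, not from this thread:

```shell
# First replication: a full send of the initial snapshot (slow, runs once).
zfs snapshot Volume/Dataset@snap1
zfs send Volume/Dataset@snap1 | ssh backup1 zfs receive Volume/Dataset

# Every replication after that: send only the blocks that changed
# between the previous snapshot and the new one.
zfs snapshot Volume/Dataset@snap2
zfs send -i Volume/Dataset@snap1 Volume/Dataset@snap2 | ssh backup1 zfs receive Volume/Dataset
```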

Create a small pool on your primary server. It need not be more than a hundred meg and a few hundred files. Then replicate that pool to one of your backup servers. Once you see how it works and are comfortable with the process, make it live.
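A dry run along those lines might look like this (the dataset name repltest and the sample data path are made up; a small dataset serves the same purpose as a separate small pool):

```shell
# Create a small test dataset on the primary and put a few files in it.
zfs create Volume/repltest
cp -R /usr/share/misc /mnt/Volume/repltest/

# Snapshot it and replicate to a backup server over SSH.
zfs snapshot Volume/repltest@test1
zfs send Volume/repltest@test1 | ssh backup1 zfs receive Volume/repltest

# The files should then be visible on the backup under /mnt/Volume/repltest.
```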

Cheers,
Matt
 

Doogie

Dabbler
Joined
Mar 9, 2014
Messages
24
Matt,

Thanks for the fast response! I've made some progress, but still do not have replication set up (after four hours of effort). I may know why--let me run this by forum members:

I want to keep the logical structure of my server as simple as possible. Only I will use it, and I have no need to subdivide my single volume (disk pool) into multiple datasets with different permissions, compression, etc. To that end, my server has a single volume, but NO datasets. I am able to successfully share the volume through SMB or SFTP and can write files, etc. with no apparent need for datasets.

Question: in order to use replication, am I *REQUIRED* to create a dataset under this volume, or can I replicate a volume without creating a dataset? Do the algorithms ZFS uses for replication require a dataset, or can I replicate my existing volume? Ideally, I would like to replicate one volume to another without datasets.

As part of my troubleshooting, my SSH keys are accepted, so security doesn't seem to be the issue. When I run the troubleshooting command from the manual (with "Volume" being the name of both the Server and Backup1 pools):

zfs send Volume@auto-20161230.2200-2w | ssh -i /data/ssh/replication 192.168.1.11 zfs receive Volume@auto-20161230.2200-2w

I get the error message:

Could not create directory '/root/.ssh'
cannot receive new filesystem stream: destination 'Volume' exists
must specify -F to overwrite it
warning: cannot send 'Volume@auto-20161230.2200-2w': Broken pipe

If I try the -F flag:

zfs send Volume@auto-20161230.2200-2w | ssh -i /data/ssh/replication 192.168.1.11 zfs receive -F Volume@auto-20161230.2200-2w

I get:

Could not create directory '/root/.ssh'
cannot unmount '/var/db/system': Device busy
warning: cannot send 'Volume@auto-20161230.2200-2w': Broken pipe

I'm stuck but don't want to go back to rsync. Any thoughts or help appreciated!

Jeff Arnholt
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
ZFS send/receive replicates a whole dataset or a hierarchy of datasets. That means it has to create the dataset from scratch on the receive side, and that side should not even try to modify it. Since you are trying to replicate the root dataset of your pool, I suppose it causes a problem with the system dataset stored on the same pool on the receive side, which cannot be destroyed by using the -F flag (and should not be).
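In practice that means receiving into a child dataset on the backup pool, which zfs receive can create from scratch, instead of into the pool root; the target name ServerCopy below is illustrative:

```shell
# Receiving into a child dataset avoids clobbering the backup pool's root
# (and the system dataset that lives under it).
zfs send Volume@auto-20161230.2200-2w | \
    ssh -i /data/ssh/replication 192.168.1.11 zfs receive Volume/ServerCopy
```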
 

Doogie

Dabbler
Joined
Mar 9, 2014
Messages
24
Well, my experience with replications has not been good--and not for the lack of effort (24+ hours), review of documentation, and the help of this community (thank you!). This was my experience:

Proceeding from the discussion above, I ended up creating a dataset on Backup1 and then ran:

zfs send Volume@auto-20161230.2200-2w | ssh -i /data/ssh/replication 192.168.1.11 zfs receive -F Volume/Dataset@auto-20161230.2200-2w

Unlike before (without datasets), this immediately created an SSH process on both Server and Backup1 and started consuming space on Backup1 at a rate of about 1 TB every 3 hrs. CPU utilization on Server was near 100%, and about 50% on Backup1. Success! Well, unfortunately, not.

This continued to run for about 26 hours. Sometime overnight, after perhaps 9 TB of network traffic, the process crapped out. I had to log in again to Backup1. When I did, all of the space on Backup1 which had been consumed was again available. It was as if I had never initiated a replication.

I have since gone back to rsync, which, if interrupted, will at least continue where it left off. It seems slow, but my 3+ years of experience with it under FreeNAS have been bulletproof--and when it comes to irreplaceable data, that is priceless.

I may experiment with replication again down the road, once I build two test servers which I can use for learning purposes.

Would anyone with experience care to chime in--has anyone tried to replicate a massive dataset (in my case, 30 TB), had some type of hiccup or interruption, and seen this type of behavior? Is replication truly an all-or-none process, where an interruption loses all progress and cannot be resumed?

Thanks in advance for any input.

Jeff Arnholt
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Hardware specs?

ZFS replication is an all or nothing process. On a LAN that doesn't matter too much, because the network should be plenty stable. But there is also a good argument for having multiple datasets for exactly this reason: you could transfer each individually and not have to start over if something gets messed up. I transferred 20 TB through replication and it maxed out my gigabit connection the entire time with no complications.
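The per-dataset approach might be sketched like this (the dataset names and the snapshot name are hypothetical):

```shell
# Replicate each dataset separately; if one transfer fails,
# only that dataset has to be restarted.
for ds in videos photos documents; do
    zfs send "Volume/${ds}@migrate" | ssh backup1 zfs receive -F "Volume/${ds}"
done
```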

Having a single big pool is kind of a mistake on your part and goes against best practices.

Replication is very easy to use and the problems you are having are related to things outside of just replication and snapshots.

Sent from my Nexus 5X using Tapatalk
 

Doogie

Dabbler
Joined
Mar 9, 2014
Messages
24
Hardware specs (identical for all 3 FreeNAS systems) are beefy i5's, 1 gigabit standard Cat6 (short runs), 32 GB memory per server, and 20 drives, 10x8TB and 10x4TB, with each 10 drive pool set up as a raidz2.

Multiple datasets--food for thought. It solves the problem, but through complexity. If all I need is one massive volume to store videos, with identical permissions, compression, etc., it is conceptually much easier to deal with one dataset vs. multiple. I'd hate to think about having to set up 10 different replications, 10 different snapshot regimens, etc. for 10 different datasets, vs. one big pool covered by one rsync process. In a similar vein, I'd much prefer having one big 8 TB drive in my desktop PC vs. eight 1 TB drives humming along--assuming proper data backups are in place. However, I imagine this is a topic of philosophical debate with no right answer. As I've mentioned above, rsync set up in the above fashion has been bulletproof for years and, honestly, was far, far easier to set up and maintain--I just wanted to try replication for the benefits Matt mentioned above.

I will give this a shot again down the road using servers and data which aren't mission critical.

Happy New Year.

Jeff Arnholt
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
Snapshots and replication are so much better than rsync.
That is a fairly wide brush you are painting with!! :)

Both replication and rsync have their place. It really depends on what you are trying to accomplish. There are cases where replication is the best tool, and places where rsync is the best tool. Like hammers and screwdrivers.

I personally use replication, rsync and snapshots as part of my backup strategy. It has served me well for years. One nice thing about rsync is that if it gets interrupted, it does NOT need to start over from the beginning. I have moved many, many TB of data over rsync and it will also saturate a gigabit link for the duration. Another great thing about rsync is that it works on pretty much any *nix system (I have even used it on WinBlows), allowing great flexibility regarding the backup target.

Cheers,
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
One nice thing about rsync is that if it gets interrupted, it does NOT need to start over from the beginning.
...
I've been reading about OpenZFS and its newer features. Resumable ZFS send & receive seems to be close to release. It will likely take 6 months or more before we see it on FreeNAS. But that is all to the good: by the time we get it, it should be both stable and well documented.
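For reference, the OpenZFS resumable send/receive feature works roughly like this, assuming a build that supports it (syntax per OpenZFS at the time; dataset and host names illustrative):

```shell
# Start the receive with -s so that, if interrupted, partial state is kept.
zfs send Volume/Dataset@snap1 | ssh backup1 zfs receive -s Volume/Dataset

# After an interruption, fetch the resume token from the receiving side...
token=$(ssh backup1 zfs get -H -o value receive_resume_token Volume/Dataset)

# ...and resume the send from where it stopped.
zfs send -t "$token" | ssh backup1 zfs receive -s Volume/Dataset
```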
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
You obviously need a bigger hammer.

RlainTheFirst

Dabbler
Joined
Sep 29, 2017
Messages
14
You obviously need a bigger hammer.
Hahahaha!
I did not expect that, thanks for a good laugh!

Also, did I understand this correctly:
ZFS replication is an all or nothing process.
As in: replication takes all snapshots and replicates everything exactly as on the source dataset (possibly recursively) onto the target destination, with no choice of which snapshots, etc.?

I am planning to create an offsite backup of my home FreeNAS server onto a remote FreeNAS server, and this is the behavior I want, but I am also curious as to whether this is the only way.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
ZFS allows you to replicate any snapshot as a diff on top of any previous one. The only requirement is that the receiving side must apply them sequentially, without gaps. But that is a manual process, which I don't think is supported by the FreeNAS UI.
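A manual sequential chain like that might look as follows (names illustrative): each incremental's starting snapshot must be the newest snapshot already present on the receiving side.

```shell
# snap1 was sent in full earlier; now apply the diffs in order, with no gaps.
zfs send -i Volume/Dataset@snap1 Volume/Dataset@snap2 | ssh backup1 zfs receive Volume/Dataset
zfs send -i Volume/Dataset@snap2 Volume/Dataset@snap3 | ssh backup1 zfs receive Volume/Dataset
```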
 