Freenas GUI Replication rollover strategy question and advice...

Apollo · Oct 6, 2015

Since upgrading to FreeNAS-9.3-STABLE-201509282017, I have been looking at the newly improved handling of snapshots and replication in the FreeNAS GUI.
A recent post on the FreeNAS website:

http://www.freenas.org/whats-new/2015/09/freenas-worst-practices.html

mentions the detrimental effect of snapshots accumulated over a long period of time.

Since updating to FreeNAS 9.3.1 and the LSI P20 driver, while also adjusting some of my routine automatic snapshots, I have noticed that scrub time went from about 14-17 hours before 9.3.1 to more than 38 hours. Quite a significant step back. I suspect the snapshots may be part of the cause.

Anyway, what I want to explore is the feasibility and reliability of the following snapshot strategy:

Let's say I want to replicate a dataset that has many datasets under it.

- The idea is to take recursive snapshots at a 5-minute interval on some or all of the datasets and keep them for 2 weeks.
- I want to set up a recursive snapshot of the top dataset every day and keep it for 2 weeks.
- I want to set up a recursive snapshot of the top dataset every week and keep it for 1 month.
- I want to set up a recursive snapshot of the top dataset every 2 weeks and keep it for 6 months.
- I want to set up a recursive snapshot of the top dataset every month and keep it for 1 million years.
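To get a feel for how many snapshots the tiers above would hold at once, here is a rough Python sketch. The intervals and retentions come from the list; treating a month as 30 days and 6 months as 182 days is my own assumption, and "1 million years" is modeled as "never expires". The count is per dataset, so a recursive task multiplies it by the number of child datasets.

```python
from datetime import timedelta

# (label, snapshot interval, retention). A retention of None means "keep forever".
# Assumption: 1 month = 30 days and 6 months = 182 days.
TIERS = [
    ("5-minute", timedelta(minutes=5), timedelta(weeks=2)),
    ("daily", timedelta(days=1), timedelta(weeks=2)),
    ("weekly", timedelta(weeks=1), timedelta(days=30)),
    ("biweekly", timedelta(weeks=2), timedelta(days=182)),
    ("monthly", timedelta(days=30), None),
]


def steady_state_count(interval, retention):
    """Snapshots alive at the same time for one tier (None if never expired)."""
    if retention is None:
        return None
    return retention // interval


def summarize(tiers):
    """Map each tier label to its steady-state snapshot count."""
    return {label: steady_state_count(i, r) for label, i, r in tiers}


def bounded_total(tiers):
    """Sum of all tiers that actually expire."""
    return sum(c for c in summarize(tiers).values() if c is not None)


for label, count in summarize(TIERS).items():
    print(label, "grows by about 12 per year" if count is None else count)
print("bounded total per dataset:", bounded_total(TIERS))
```

Under these assumptions the 5-minute tier dominates everything else, so shortening its retention is the quickest way to bring the count down.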

Setting up automatic replication to a backup drive on localhost seems to work, and should work for remote replication as well. Also, the same dataset can be replicated to multiple remote volumes, which makes this sweet to handle.

What I am cautious about is whether this is the best approach and whether it is the most reliable.

On my current FreeNAS setup, I have multiple datasets and they all have different snapshot strategies. Some require shorter snapshot intervals, but in the end I want to maximize retention. Some datasets I do not access on a regular basis, but when I do, I would like snapshots to be taken soon after I change their content.
In a nutshell, it is not easy to set up an adequate snapshot policy while retaining just a handful of snapshots.
The idea of a snapshot is to freeze the data within the dataset at the time the snapshot was taken, meaning that deleting a file or altering it in any way only affects the live copy of the file, not the copies held in earlier snapshots.
If I want to recover an earlier version of that file, I just need to check the differences between snapshots and locate the snapshot that holds the previous version. I then clone the snapshot, recover the file, and voila.
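As an illustration of the "locate which snapshot holds the previous version" step, here is a small Python sketch. Instead of cloning, it walks the read-only `.zfs/snapshot/<name>/` directories that ZFS exposes under a dataset's mountpoint, which is a different route from the clone described above. The function names and the example paths are made up for illustration, and sorting by name assumes the snapshot names start with a timestamp so that name order matches time order.

```python
from pathlib import Path


def file_versions(mountpoint, relative_path):
    """List (snapshot_name, size, mtime) for every snapshot that holds the file."""
    snap_root = Path(mountpoint) / ".zfs" / "snapshot"
    versions = []
    # Assumption: sorting by name gives chronological order.
    for snap_dir in sorted(snap_root.iterdir()):
        candidate = snap_dir / relative_path
        if candidate.is_file():
            st = candidate.stat()
            versions.append((snap_dir.name, st.st_size, st.st_mtime))
    return versions


def changed_versions(versions):
    """Keep only the snapshots where size or mtime differs from the previous one."""
    result = []
    previous = None
    for name, size, mtime in versions:
        if (size, mtime) != previous:
            result.append((name, size, mtime))
            previous = (size, mtime)
    return result


# Hypothetical usage (mountpoint and file path are placeholders):
# for name, size, mtime in changed_versions(file_versions("/mnt/tank/docs", "notes/todo.txt")):
#     print(name, size, mtime)
```

Comparing size and mtime is only a cheap heuristic; a checksum comparison would be more thorough but slower across thousands of snapshots.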
If I were to take a snapshot every 5 minutes with a snapshot lifetime of 1 year, that comes to 105,120 snapshots for a single dataset.
If I take recursive snapshots of my entire volume, this is going to become unmanageable. Besides, it doesn't make sense to keep snapshots for a year at that granularity, at least not for me.
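For reference, the arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: it assumes a 365-day year, and the dataset count in the second call is a made-up number to show how a recursive task scales the total.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def flat_snapshot_count(interval_minutes, years=1, datasets=1):
    """Snapshots retained when every snapshot is kept for the full period."""
    per_dataset = MINUTES_PER_YEAR * years // interval_minutes
    return per_dataset * datasets


print(flat_snapshot_count(5))               # one dataset, 5-minute interval, 1 year
print(flat_snapshot_count(5, datasets=10))  # hypothetical volume with 10 datasets
```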

I have started evaluating this process at a small scale, but I do not have much data to work with, and I do not want to run it over a long period of time only to find myself pushed into a corner because I overlooked one aspect.
As far as I can tell, this approach seems to do what I am trying to accomplish while keeping a more moderate number of snapshots.

Any thoughts would be appreciated.

dlavigne · Oct 13, 2015

Did you decide upon a solution?

Apollo · Oct 13, 2015

dlavigne said:
Did you decide upon a solution?

I am currently running a series of tests using FreeNAS GUI replication and I do get some errors with specific snapshots, but it seems the snapshots that fail are the ones that have expired and been destroyed.
This causes the replication status to always show as failed.

I have some issues with FreeNAS GUI replication when it comes to aborting the operation. The only way I can stop it is to disable the replication task and wait for the one in progress to finish. If my backup drive experiences an I/O failure (forced removal or drive failure), SMB becomes disabled and FreeNAS needs to be restarted.
Concurrent replication is not possible.

More tests are needed, and I was hoping the community could provide some feedback from real-world scenarios.

Also, what is the automatic replication mechanism?

toadman · Oct 13, 2015

As an alternative I've been using the script detailed here for a long while. Works perfectly. The script will let you trim your snapshots on any schedule you want. Replication then does the rest.

https://forums.freenas.org/index.ph...napshots-similar-to-apples-timemachine.10304/

I've got 4 main datasets. I schedule a snapshot 1x/day, then prune according to need. Some only need 2 weeks' worth. For others I keep a week's worth, then 4 weeklies, then 3 monthlies, then 1 year. I kill everything older than a year. I set up the snapshots via the GUI at 1x/day with a 2-year lifetime, i.e. the script manages all the deletions.

I did this originally because multiple snapshot intervals/expirations were not supported on the same dataset. (Or at least there were some bugs floating around.) This script gets (got) around that limitation/issue.
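To make the pruning idea concrete, here is a minimal Python sketch of that kind of thinning. It is not the linked script, just one reading of the schedule: keep everything from the last 7 days, then the newest snapshot per ISO week for the following 4 weeks, then the newest per calendar month up to a year, and nothing older than a year. The function names and window sizes are assumptions for illustration, and the sketch only decides what to keep; it does not call `zfs destroy`.

```python
from datetime import datetime, timedelta


def select_keepers(timestamps, now):
    """Return the set of snapshot timestamps to keep under the assumed schedule."""
    keep = set()
    seen_buckets = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        if age > timedelta(days=365):
            continue  # older than a year: prune
        if age <= timedelta(days=7):
            keep.add(ts)  # recent window: keep every snapshot
            continue
        if age <= timedelta(days=35):
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])  # one per ISO week
        else:
            bucket = ("month", ts.year, ts.month)  # one per calendar month
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.add(ts)
    return keep


def select_prunable(timestamps, now):
    """Return the timestamps that the schedule would delete, oldest first."""
    keepers = select_keepers(timestamps, now)
    return sorted(ts for ts in timestamps if ts not in keepers)


# Example: 400 daily snapshots ending at a fixed "now".
now = datetime(2015, 10, 13, 12, 0)
daily = [now - timedelta(days=i) for i in range(400)]
print(len(select_keepers(daily, now)), "kept,", len(select_prunable(daily, now)), "pruned")
```

Because the snapshots are visited newest first, each week or month bucket keeps its most recent snapshot; keeping the oldest in each bucket instead would be an equally valid choice.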

The downside is it only manages to the hour, not the minute. You'd have to modify the script to do the every 5 minutes part.
