Replication of snapshots

Status
Not open for further replies.

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I am planning on setting up a FreeNAS-based storage solution. It'll be mainly for storage of large data files in a lab/research environment. No databases, just plain files. Most of the data is only added; just a part of it is edited or changed over time. The network connection is Gigabit Ethernet. I need about 10 TB of usable storage.

One of the requirements is that data is saved twice. So I am considering setting up two similar systems, each with five 3 TB disks in RAIDZ1. System 1 is the system that users connect to, and it makes regular snapshots. System 2 would be the backup system. System 1 sends its latest snapshot to System 2 after office hours. In the end, this should give me regular snapshots on System 1, and daily snapshots on System 2.

Being totally new to this, I was wondering if this makes sense, and if this fits within the standard FreeNAS capabilities. Suggestions for improvements are of course welcome as well.

An alternative would be to only store the most recent nightly snapshot of System 1 on System 2, but I'm not sure if it really has a large advantage in terms of required disk space.

Also, does FreeNAS automatically remove old snapshots when the disk fills up?

Thanks for any insights!

Best,
Koen
 

joshg678

Explorer
Joined
Sep 27, 2012
Messages
52
Well it sounds like you have a wonderful road ahead of you.

This sounds similar to a set of requirements that I have had to deal with for the government.

I would recommend:

Box 1
6 x 3 TB RAIDZ2
16 GB RAM
Higher-clocked CPU

Box 2
5 x 3 TB RAIDZ1
16 GB RAM
Slightly lower-clocked CPU

This way your primary box will be able to withstand two hard drive failures, and the second system a single drive failure. Both will have the same amount of usable storage (about 12 TB).
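The 12 TB figure can be checked with a quick calculation. This is only a rough sketch: the function name is made up for illustration, and it counts nominal data disks only, ignoring ZFS metadata overhead and the TB vs TiB difference, so the space you actually see will be somewhat lower.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Nominal usable capacity of one RAIDZ vdev: data disks times disk size."""
    if disks <= parity:
        raise ValueError("a RAIDZ vdev needs more disks than parity devices")
    return (disks - parity) * disk_tb


box1 = raidz_usable_tb(disks=6, disk_tb=3.0, parity=2)  # RAIDZ2, survives two failures
box2 = raidz_usable_tb(disks=5, disk_tb=3.0, parity=1)  # RAIDZ1, survives one failure
print(box1, box2)  # prints: 12.0 12.0
```

Both layouts end up with four data disks, which is why the usable space matches even though Box 1 has one more drive.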

You could then set up ZFS replication between the two boxes: http://doc.freenas.org/index.php/Storage_Configuration

I would have a directly connected Gigabit link between the two boxes, used only for ZFS replication (in addition to the primary network). This avoids slowing down the primary (user) network connection of the main box.

Also set up periodic snapshots on the main box. You can set how long you want to keep them, and how often to take them.
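For anyone curious what snapshot-plus-replication amounts to at the ZFS level, here is a minimal sketch that only builds the command lines and does not run anything. In FreeNAS you would normally configure this through the GUI instead. The dataset names (tank/data, backup/data), the host name (backup-box) and the snapshot names are hypothetical placeholders, and the exact options FreeNAS itself uses may differ.

```python
import shlex


def snapshot_cmd(dataset, snap):
    """Command that takes a snapshot named dataset@snap."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def incremental_send_cmd(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Shell pipeline that sends only the changes between two snapshots
    to another machine over ssh."""
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset]
    return " | ".join(shlex.join(part) for part in (send, recv))


# Hypothetical names, for illustration only.
print(shlex.join(snapshot_cmd("tank/data", "daily-2012-10-11")))
print(incremental_send_cmd("tank/data", "daily-2012-10-10", "daily-2012-10-11",
                           "backup-box", "backup/data"))
```

Because the send is incremental, only blocks changed since the previous snapshot cross the wire, which is what makes a nightly run over a single Gigabit link practical.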
 

mikeyr

Dabbler
Joined
Sep 19, 2011
Messages
20
Yes, this works. I'm doing something similar and it works really well. On the main box I have two different snapshot schedules: one is a daily that has a 2 week "expiration", and the second is a monthly with a 2 year expiration. This way I have daily snapshots for the last 2 weeks and then monthly snapshots for the last 2 years. The replication takes care of making sure the second box has a copy of all the snapshots. It takes very little disk space to keep a snapshot (unless perhaps a lot of big files have been deleted), so keeping a number of them around is not a problem.
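The two-schedule expiration scheme can be sketched like this. It is just an illustration of the idea, not how FreeNAS implements it: the snapshot names and the tuple layout are invented for the example, and "2 years" is approximated as 730 days.

```python
from datetime import date, timedelta

# Lifetime per schedule, matching the daily/monthly setup described above.
RETENTION = {
    "daily": timedelta(weeks=2),
    "monthly": timedelta(days=730),  # roughly 2 years (assumption)
}


def expired(snapshots, today):
    """Return the names of snapshots older than their schedule's lifetime.

    snapshots is an iterable of (name, schedule, taken_on) tuples.
    """
    return [name for name, schedule, taken_on in snapshots
            if today - taken_on > RETENTION[schedule]]


# Hypothetical snapshot list.
snaps = [
    ("daily-2012-10-19", "daily", date(2012, 10, 19)),
    ("daily-2012-10-01", "daily", date(2012, 10, 1)),
    ("monthly-2012-01-01", "monthly", date(2012, 1, 1)),
    ("monthly-2010-09-01", "monthly", date(2010, 9, 1)),
]
print(expired(snaps, today=date(2012, 10, 20)))
```

With these sample dates, the daily from October 1 and the monthly from September 2010 are past their lifetimes, while the other two are kept.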
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
Thanks for the replies so far! I guess that the originally planned setup should work. I'll think about RAIDZ1 vs RAIDZ2, but I think in the end it's not that important. Also, if the price is about the same, I'd probably choose two identical setups, rather than one slightly faster and one slightly slower system. Although I may be able to find an old system that I can use for the second (i.e., the backup) system. I guess I'll move on to the hardware selection then.
 

Joshua Parker Ruehlig

Hall of Famer
Joined
Dec 5, 2011
Messages
5,949
kavermeer said: "Thanks for the replies so far! I guess that the originally planned setup should work. I'll think about RAIDZ1 vs RAIDZ2, but I think in the end it's not that important. Also, if the price is about the same, I'd probably choose two identical setups, rather than one slightly faster and one slightly slower system. Although I may be able to find an old system that I can use for the second (i.e., the backup) system. I guess I'll move on to the hardware selection then."

If you don't need as much space and have a bit of extra CPU, I really like RAIDZ2. You sleep better at night (you should sleep really well with 2 systems =] ). It also gives you a second level of parity to fight bit rot when you scrub your zpools, so permanent file corruption is less likely.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
kavermeer said: "Thanks for the replies so far! I guess that the originally planned setup should work. I'll think about RAIDZ1 vs RAIDZ2, but I think in the end it's not that important."
How important is it to rebuild the 10TB of data? You are using 3TB drives; unless you are mirroring them, you should use double parity. How disruptive is it when the primary pool is lost, and how long will it take to rebuild? Given this is for backups, raidz2 is a good fit.
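To put a rough number on the "how long will it take to rebuild" question, here is a back-of-the-envelope sketch of how long it takes just to copy the data back over the Gigabit link mentioned in the first post. This is not a resilver estimate. The efficiency parameter is an assumption to account for protocol overhead and disks that cannot keep the link saturated; 0.8 is an arbitrary example value.

```python
def transfer_hours(data_tb, link_gbit_per_s, efficiency=1.0):
    """Hours needed to move data_tb terabytes over a link of the given speed."""
    data_bits = data_tb * 1e12 * 8
    bits_per_second = link_gbit_per_s * 1e9 * efficiency
    return data_bits / bits_per_second / 3600


# 10 TB over Gigabit Ethernet: ideal line rate, then an assumed 80% efficiency.
print(round(transfer_hours(10, 1), 1))
print(round(transfer_hours(10, 1, efficiency=0.8), 1))
```

Even at full line rate this comes out to roughly a day of copying, so losing the whole primary pool means at least that much time before it is fully repopulated from the backup.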
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
Just a quote from the original message:

"One of the requirements is that data is saved twice. So I am considering setting up two similar systems, each with five 3 TB disks in RAIDZ1. System 1 is the system that users connect to, and it makes regular snapshots. System 2 would be the backup system. System 1 sends its latest snapshot to System 2 after office hours. In the end, this should give me regular snapshots on System 1, and daily snapshots on System 2."
So: Yes, I do plan on having a backup. If two drives fail on System 1, I still have System 2. Downtime up to a day or two isn't that much of a problem, if the data is still accessible through System 2.

You're saying 'this is for backups'. What's 'this'?
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
kavermeer said: "You're saying 'this is for backups'. What's 'this'?"
Actually, I said "Given this is". The use case.

kavermeer said: "System 1 is then the system that users connect to, and makes regular snapshots. System 2 would be the backup system."
It wasn't clear that System 2 was going to be user accessible or even in the same location. If everything can fail over to System 2, you can try to argue for single parity. IMHO, it's not worth it with that amount of data, and 3TB disks shouldn't be used with raidz1.
 

joshg678

Explorer
Joined
Sep 27, 2012
Messages
52
paleoN said: "IMHO, it's not worth it with that amount of data and 3TB disks shouldn't be used with raidz1."


I agree. I would at least do raidz2 on the primary system, and raidz1 on the secondary.
 