Offsite Backup Best Practice?

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Hello Everyone.

I'm currently speccing out a FreeNAS build for my family to keep both sensitive and non-sensitive data/documents resiliently stored, and I'm trying to minimize the size of the machine so it can be carried out of the house easily in the event of a hurricane by a not-so-strong person (Mini-ITX case, 6-8 2.5" 1TB SATA SSDs). In addition, I'm trying to plan for making regular backups which will go off-site to a secure location.

My questions are:
1) If I fill up all of the SATA ports for my 1 Z2 pool, can I use an external drive/enclosure off a USB connection to back up the entire pool, or should I plan to directly attach a single big drive for replication/backup on the motherboard?

2) If I can't use a USB drive and don't have any spare SATA ports, is backing up over the network to a drive on a PC generally the way forward?

3) Should I back up the entire pool to a single enterprise/datacentre HDD, or would it be better to back up/replicate to multiple drives? I'm not entirely clear on how backing up a RAIDZ pool works when it comes to restoring, i.e., will the 2 copies of parity and checksums on the backup drive(s) "generally" mitigate drive read errors during a restore?
 
Joined
Oct 18, 2018
Messages
969
I'm currently speccing out a FreeNAS build for my family to keep both sensitive and non-sensitive data/documents resiliently stored, and I'm trying to minimize the size of the machine so it can be carried out of the house easily in the event of a hurricane by a not-so-strong person (Mini-ITX case, 6-8 2.5" 1TB SATA SSDs). In addition, I'm trying to plan for making regular backups which will go off-site to a secure location.
Interesting consideration. I hadn't thought of that, as I don't live in a hurricane area. Could you instead get a hot-swap case and, if you have to leave quickly, pull the drives from the system? I think it would be less limiting on the build.

1) If I fill up all of the SATA ports for my 1 Z2 pool, can I use an external drive/enclosure off a USB connection to back up the entire pool, or should I plan to directly attach a single big drive for replication/backup on the motherboard?
USB drives are, in general, a bad idea for storage with FreeNAS, but some folks do use them for backups. The issue is that USB enclosures contain bridge hardware that often prevents FreeNAS from having full and complete access to the disks, and if that hardware is cheaply made you could see short lifespans and high failure rates. That being said, many folks use them for backups because the convenience outweighs the risks and they might not have other good options. One thing to consider is a case with hot-swap bays: you could use a few of those to plug in your backup drives and remove them for off-site storage when required.

2) If I can't use a USB drive and don't have any spare SATA ports, is backing up over the network to a drive on a PC generally the way forward?
I do this, except I don't back up to a PC. I back up to a cheaply built second FreeNAS server. I bought a used chassis, used board, used CPU, new memory, and new drives to get the cost as low as I could, and then I set up ZFS replication between my primary server and my backup server.
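
For readers who want to see what replication between two servers boils down to, here is a minimal sketch in Python that only builds the underlying `zfs snapshot`, `zfs send`, and `zfs receive` command lines without running them. FreeNAS normally manages this for you through its replication tasks, so this is not how the poster necessarily set it up. The dataset names, host name, and snapshot names are hypothetical.

```python
import shlex


def replication_commands(src_dataset, dst_host, dst_dataset, new_snap, prev_snap=None):
    """Return (snapshot_cmd, send_receive_pipeline) as shell strings.

    Nothing is executed here; the strings are only built for inspection.
    All dataset, host, and snapshot names are supplied by the caller.
    """
    new_full = f"{src_dataset}@{new_snap}"
    # Recursive snapshot of the dataset and all of its children.
    snapshot_cmd = shlex.join(["zfs", "snapshot", "-r", new_full])

    # -R: replication stream that includes child datasets and properties.
    send = ["zfs", "send", "-R"]
    if prev_snap is not None:
        # Incremental stream: only blocks changed since the previous snapshot.
        send += ["-i", f"{src_dataset}@{prev_snap}"]
    send.append(new_full)

    # -u: do not mount the received datasets on the backup machine.
    remote = shlex.join(["zfs", "receive", "-u", dst_dataset])
    receive = ["ssh", dst_host, remote]
    pipeline = f"{shlex.join(send)} | {shlex.join(receive)}"
    return snapshot_cmd, pipeline
```

Calling `replication_commands("tank/family", "backup-nas", "backup/family", "2020-01-05", "2020-01-04")` returns the snapshot command and an incremental send piped over SSH to the backup machine. The first run would omit `prev_snap` so that a full stream is sent.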

3) Should I back up the entire pool to a single enterprise/datacentre HDD, or would it be better to back up/replicate to multiple drives? I'm not entirely clear on how backing up a RAIDZ pool works when it comes to restoring, i.e., will the 2 copies of parity and checksums on the backup drive(s) "generally" mitigate drive read errors during a restore?
The general rule of thumb for backups is the 3-2-1 rule. You should have 3 copies of your data, stored in at least two different mediums, and at least 1 of them should be offsite. Here is how I do it.
Copy 1: Primary server
Copy 2: Backup server, stored in a pool with a mirror vdev. Receives updates from the main server constantly via ZFS replication.
Copy 3: Pool with a mirror vdev. Swapped out regularly with Copy 2, so that I have 3 copies of all but the absolute most recent data and 2 copies of that.

I don't store my data on 2 different mediums, but I think it follows the spirit closely enough, especially given that my backups are mirror vdevs, so even if Copy 1 and Copy 2 were destroyed by fire, my Copy 3 would actually be 2 copies of the data. Exactly how you choose to apply the rules in your case is up to you. How horrible would it be if something happened to your main server and your backups didn't work out? If that would be the worst thing ever, consider doing something closer to what I do. If you're comfortable with the risk of having your final copy be a single copy on a USB drive, that is fine too.
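
The 3-2-1 rule above can be written down as a small check. This is only an illustrative sketch: the `Copy` class, the `check_321` function, and the medium labels are made up for the example, and the media assigned to the three copies below are assumptions rather than details given in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # free-form label, e.g. "hdd", "ssd", "tape", "cloud"
    offsite: bool


def check_321(copies):
    """Report which parts of the 3-2-1 rule a set of copies satisfies."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.medium for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# The setup described above, assuming every copy lives on hard drives
# and that the rotated mirror (Copy 3) is the one kept off-site.
setup = [
    Copy("primary server", "hdd", offsite=False),
    Copy("backup server mirror", "hdd", offsite=False),
    Copy("rotated mirror", "hdd", offsite=True),
]
result = check_321(setup)
```

Under those assumptions the check reports three copies and one off-site copy but only a single medium, which is the gap the poster acknowledges.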

Exactly what the risks are is hard to estimate. But hopefully this gives you some idea of possible routes to take.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
A) Thanks for the informative and congenial post. This seems like a pretty high-standards, high-strung community, and I half-thought I'd get chewed out for asking newbie questions :D

B) The case I have in mind is the Fractal Design Define Nano S. Full build: https://pcpartpicker.com/list/HHXWzN. Ignore the drive mount compatibility warning. The 3.5" mount points have 2x2.5" mounting compatibility.

C) The cost is already getting pretty high, a consequence of using solid state storage of course, but the benefits make it worthwhile: partial power loss protection on the pool drives, write speed (we will have up to 4 PCs pushing full-drive-image backups over 10GbE), and light weight (with physical shock/vibration resistance) in the event the one frail person is alone and has to get up and go in an emergency.

D) We can't back up/replicate over the internet. Even on a commercial fiber line, our speed is limited to 10Mbps, not even joking. So backing up to a 2nd server seems pointless since it wouldn't actually be off-site. Currently we're planning on putting the backup of the NAS on one 3.5" datacentre HDD and carrying it to/from its off-site secure location in a rubber+foam-filled case. After all, even enterprise drives are ridiculously sensitive to vibration and physical shock.
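
To put the 10Mbps limit in perspective, here is a rough back-of-the-envelope calculation in Python. The 4 TB figure is an assumed amount of data chosen for illustration, not a number from the build, and protocol overhead is ignored.

```python
def transfer_days(data_bytes, link_bits_per_second):
    """Days needed to move data_bytes over a link, ignoring protocol overhead."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / 86_400  # seconds per day


FOUR_TB = 4 * 10**12            # assumed amount of data, in bytes
wan_days = transfer_days(FOUR_TB, 10 * 10**6)    # 10 Mbps internet line
lan_days = transfer_days(FOUR_TB, 10 * 10**9)    # 10GbE local network

print(f"10 Mbps: {wan_days:.1f} days, 10GbE: {lan_days * 24:.1f} hours")
```

At 10 Mbps a full copy of 4 TB would take roughly 37 days of uninterrupted transfer, while the same data moves across 10GbE in under an hour at line rate. That gap is why carrying a drive off-site is the practical option here.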

So I guess my question about backing up and restoring the NAS in a catastrophic failure is this:
Does the replication put the pool data and both copies of the parity and checksums on the backup media? If so, then during a restore, if drive read errors are encountered, is the restore process smart enough to check the copies of parity/checksums (or, if the error is while reading parity/checksum, to then check the corresponding data sector) to get around the read error? If it doesn't copy over the parity and checksums, does it only copy the data into whatever amount of space is currently being utilised?
 
Joined
Oct 18, 2018
Messages
969
A) Thanks for the informative and congenial post. This seems like a pretty high-standards, high-strung community, and I half-thought I'd get chewed out for asking newbie questions :D
I think these forums have grown out of those days, or at least we are trying to. :) I'm glad you gave us a chance!

B) The case I have in mind is the Fractal Design Define Nano S. Full build: https://pcpartpicker.com/list/HHXWzN. Ignore the drive mount compatibility warning. The 3.5" mount points have 2x2.5" mounting compatibility.
I did a build in a Fractal Design Define R6 as my first FreeNAS build and LOVED it. With a few mods I was able to get 12 drives in there. It didn't have hot-swap, so it wouldn't save you from either rebooting your machine to swap backup drives or using USB drives, but I have a high opinion of Fractal Design as a result. If you opt for the non-hot-swap, USB backup option, Fractal Design is a great choice in my opinion.

C) The cost is already getting pretty high, a consequence of using solid state storage of course, but the benefits make it worthwhile: partial power loss protection on the pool drives, write speed (we will have up to 4 PCs pushing full-drive-image backups over 10GbE), and light weight (with physical shock/vibration resistance) in the event the one frail person is alone and has to get up and go in an emergency.
Yeah, costs are certainly a limiting factor.

D) We can't back up/replicate over the internet. Even on a commercial fiber line, our speed is limited to 10Mbps, not even joking. So backing up to a 2nd server seems pointless since it wouldn't actually be off-site. Currently we're planning on putting the backup of the NAS on one 3.5" datacentre HDD and carrying it to/from its off-site secure location in a rubber+foam-filled case. After all, even enterprise drives are ridiculously sensitive to vibration and physical shock.
The benefit of backing up to a second server is that if your primary server has an issue that compromises data, such as an electrical fault, small fire, or virus, your secondary server may be fine and can be used to recover data easily. You're absolutely right that a second server in the house doesn't protect you from everything, but it will give you better protection if something happens to the main one. I opted for that route half in an effort to avoid USB drives and half to feed my computer building addiction.

The careful rotation of drives off-site is a good idea if you're not doing any sort of cloud backup. Fortunately, drives that are powered down with their heads parked are a lot more resilient than drives that are spinning. I still pad my off-site drives in a similar way, though: it's just so easy to protect them a bit more, and the cost of dropping and damaging one is too high.

does the replication put the pool data and both copies of the parity and checksums on the backup media?
If I understand your question correctly, yes. Replication sends a copy of the data from one machine to the other and if you're using a pool on the backup side which has good redundancy in the vdevs you'll get all of the benefits of parity and protection from that. You could restore from the backup and your data would be exactly as it was; though if you do not save your configs as well you might need to reconfigure your sharing etc.

If so, then during a restore, if drive read errors are encountered, is the restore process smart enough to check the copies of parity/checksums (or, if the error is while reading parity/checksum, to then check the corresponding data sector) to get around the read error?
This will depend on your backup media. Checksums only detect a bad read; without any redundancy on the backup media, an unrecoverable read error stays unrecoverable even though the checksum lets you notice it. For that reason I use mirror vdevs for my backups. If reading from the backup hits an unrecoverable read error, ZFS has the second disk to recover the data from, and I likely wouldn't even notice the issue unless both disks had a problem at the exact same spot, which is extremely unlikely.

If it doesn't copy over the parity and checksums, does it only copy the data into whatever amount of space is currently being utilised?
It definitely won't copy the "parity". Parity is a property of the vdev layout on the disks, not of the data being sent. If you back up from a pool that has parity because it uses a RAIDZ2 vdev to a pool composed of a single disk, you won't get any parity on the backup. Checksums are different: they are part of how ZFS stores data. If you're backing up to a ZFS pool, you get checksums on the backed-up data simply because you're using ZFS on both ends. So, yes, if you do replication you get your checksums. If you want redundancy too, you'll want a mirror, RAIDZ1, RAIDZ2, or RAIDZ3 backup vdev.
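
The difference between detecting an error and being able to repair it can be shown with a toy model in Python. This is a sketch of the idea only, not of how ZFS is implemented: SHA-256 stands in for whatever checksum the pool uses, each "disk" is just a bytes value, and the function names are invented for the example.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def read_block(copies, expected):
    """Return the first copy whose checksum matches, or None if every copy is bad."""
    for data in copies:
        if checksum(data) == expected:
            return data
    return None


original = b"tax-return-2019"
expected = checksum(original)      # checksum recorded when the block was written
damaged = b"tax-return-2O19"       # simulated bad read: one byte differs

# Single-disk backup: the checksum exposes the damage, but there is no good copy left.
single_result = read_block([damaged], expected)

# Mirrored backup: the first disk fails the check, the second disk still verifies.
mirror_result = read_block([damaged, original], expected)
```

Here `single_result` is `None` (the error is detected but cannot be repaired), while `mirror_result` is the original data. That is the practical argument for giving the backup pool its own redundancy.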

Anyway, I hope this helps answer some of your questions. I view backups the same way I view ECC memory or using server vs consumer boards. In principle, I don't think using non-ECC memory, USB backup disks, or consumer boards with FreeNAS is wrong. Lots of folks have great builds with hardware like that. I do think it is worth knowing the trade-offs, though. If I used USB backup drives for my movie collection and I lost it all one day, maybe that isn't a big deal; I can always get the movies back. But if I did it with my tax information or personal family photos, maybe I see the risk differently, or maybe I don't. So, whatever choice you make is the right one so long as you know the trade-offs and you're comfortable with them. Like I said above, a BIG motivating factor for my backup strategy was so I could build another machine. I was also interested in more secure data, so I got both with my strategy. That definitely isn't right for everyone though.

edit:
Some quick thoughts about your build. It looks like it has an Intel NIC, which is important. Realtek for the primary NIC has given some folks performance issues. You likely already know this, but having only that single PCIe 4.0 x16 slot leaves a LOT of bandwidth on the table. Any time you use a board of this form factor you're kinda in this place. It might not be an issue for you since you've got 10Gb on board, but if you wanted to expand your system you've only got 1 slot to do it in. I did a quick search about whether OCuLink is supported well on FreeNAS and didn't see anything definitive one way or the other, so that's something to look into. If it doesn't work well you could always get a used HBA such as this one. Also, I imagine you're short on space in the case, so the M.2 is useful to boot off of to save the SSD bays. In general, keep in mind that the boot pool doesn't need to be fast. You could boot off old laptop HDDs if you wanted. M.2 really shines with SLOG devices, but that doesn't apply to every build.

Sorry for being long-winded. Very cool build though!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Could you instead get a hot-swap case and in the event you have to leave quickly pull the drives from the system?
I've done that a couple of times, for that exact reason--heading up the Interstate with a laundry basket full of hard drives (yes, they were padded). It did mean we couldn't stream Plex movies to the hotel, though...
 
Joined
Oct 18, 2018
Messages
969
I've done that a couple of times, for that exact reason--heading up the Interstate with a laundry basket full of hard drives (yes, they were padded). It did mean we couldn't stream Plex movies to the hotel, though...
Ha, I can only imagine. If something happened to my house, then other than family and pets I think the HDDs in my primary server are my most precious possessions. They hold decades of family photos and memories that I just wouldn't want to risk.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Anyway, I hope this helps answer some of your questions. I view backups the same way I view ECC memory or using server vs consumer boards. In principle, I don't think using non-ECC memory, USB backup disks, or consumer boards with FreeNAS is wrong. What I do worry about is when folks make those choices without fully understanding the trade-offs. If I used USB backup drives for my movie collection and I lost it all one day, maybe that isn't a big deal; I can always get the movies back. But if I did it with my tax information or personal family photos, maybe I see the risk differently, or maybe I don't. So, whatever choice you make is the right one so long as you know the trade-offs and you're comfortable with them. Like I said above, a BIG motivating factor for my backup strategy was so I could build another machine. I was also interested in more secure data, so I got both with my strategy. That definitely isn't right for everyone though.
Yeah, I'm going with ECC memory and partial power loss protection. And the NAS will be running on an APC UPS with a Seasonic 80+ Titanium PSU to keep the power steady & clean and give graceful shutdown on power loss. It will store taxes, medical scans, passport copies, and other sensitive, must-preserve items.

Okay, so to back up a Z2 pool ideally you'd want 3 drives so you get parity included. Oof... Would backing up to a Z1 pool be an endorsed compromise on cost/physical space so there is both parity and data even if there isn't the double parity of the main pool? Similarly, could you back up the Z2 to a Z3 if you wanted extra assurance the backup is more resilient?

And can you easily make a vdev from multiple USB-attached devices? If I switch USB ports, will that 2nd vdev effectively stop working or trigger a resilver? If so, ouch...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I think my HDDs in my primary server are my most precious possessions.
The problem in my case is that, by setting up my storage as ZFS likes it (one big pool with all my storage), I've lost the ability to take only the important stuff with me. I've got 20+ TB of video that's all replaceable (it would be a hassle, but it's certainly do-able), mixed with a much smaller quantity of stuff that's really important. So to protect the important stuff, all 18 disks have to go.
 
Joined
Oct 18, 2018
Messages
969