Dataset Backup Requirements

NAMoulton

Dabbler
Joined
Apr 10, 2023
Messages
19
Good Day Everyone!

I have been trying to recover my online privacy by running everything as locally as possible, without cloud resources. I have been successfully running TrueNAS Core (v13.0-U4) with a Nextcloud jail for nearly a year now, and absolutely love it. However, I am stuck on one issue in my deployment: an offsite backup. I currently have a 10TB capacity NAS with about 2TB of personal/family data stored.

My major question is: Are the recurring config & dataset Snapshots enough to upload offsite to recover from a total loss of my current NAS? Or do I need to have a fully replicated dataset?

I would be okay with a cloud service as long as it's encrypted and trustworthy. Especially since the alternative seems to be creating another TrueNAS and asking a family member to keep it (no one else in my family is really that techy - they barely change the default router passwords, let alone know what to do with a mini server haha).
 
Joined
Oct 22, 2019
Messages
3,641
Are the recurring config & dataset Snapshots enough to upload offsite to recover from a total loss of my current NAS? Or do I need to have a fully replicated dataset?
You need the capacity to house the total amount "referenced" by the most recent snapshot, at minimum. (Even greater capacity if you also want to back up older snapshots that refer to deleted data.)
 

NAMoulton

I have auto-Snapshots scheduled every night with 30-day retention. I just looked at the most recent Snapshot, and I have 1.86TiB of "Referenced" data. If I were to set up a 4TB storage solution, how/what would I need to do to have a fully ready catastrophic backup of my data?

Trying to get a grasp of this, does the Snapshot need a fully replicated offsite dataset for it to work? Or just the capacity to potentially rebuild itself to that "Referenced" size?
 
Joined
Oct 22, 2019
Messages
3,641
A snapshot is a complete (read-only, immutable) filesystem. The moment you "roll back" to a snapshot, your live filesystem starts over from that "point in time".

Technically, as long as your most recent snapshot is replicated to another destination, you have a "backup" of your entire filesystem in its state at the "point in time" that the snapshot was created.

It's not feasible to replicate an entire filesystem's worth of data every time you send a snapshot to another pool. That's where "incremental" sends come in. Only the difference between the previously backed-up snapshot and the most recent snapshot needs to be transferred. (The end result is the same: a full filesystem at the most recent "point in time" the snapshot was created.)
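As a rough sketch of the difference between a full and an incremental send, the Python below only builds the `zfs send` / `zfs receive` command lines and prints them; it does not run anything. The dataset `tank/family`, the target `backup/family`, the host `backup-nas`, and the snapshot names are all hypothetical, and a TrueNAS Replication Task would normally take care of this for you.

```python
import shlex

def send_command(dataset, new_snap, base_snap=None):
    """Build the argv for `zfs send`: full stream, or incremental if a base is given."""
    cmd = ["zfs", "send"]
    if base_snap is not None:
        # -i sends only the changes made between base_snap and new_snap
        cmd += ["-i", f"{dataset}@{base_snap}"]
    cmd.append(f"{dataset}@{new_snap}")
    return cmd

def pipeline(dataset, target, host, new_snap, base_snap=None):
    """Render the send | ssh receive pipeline as a single shell string."""
    send = shlex.join(send_command(dataset, new_snap, base_snap))
    recv = shlex.join(["zfs", "receive", target])
    return f"{send} | ssh {shlex.quote(host)} {shlex.quote(recv)}"

# Hypothetical names, for illustration only.
# First backup: the whole filesystem as of the snapshot.
print(pipeline("tank/family", "backup/family", "backup-nas", "auto-2023-04-10"))
# Later backups: only the difference since the last snapshot the destination has.
print(pipeline("tank/family", "backup/family", "backup-nas",
               "auto-2023-04-11", base_snap="auto-2023-04-10"))
```

Either way the destination ends up holding a complete filesystem at the newer snapshot; the incremental form just transfers less data to get there.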

You'll obviously need more than what is "referenced", since if you're going to keep doing regular backups, it's assumed you're going to keep your older snapshots as well (30 days). These older snapshots likely hold "deleted" data, which consumes space not-yet-freed. It's also worth noting, when it comes to NAS, ZFS, and data storage, you want sufficient "breathing" room. Don't plan to reach near capacity on any pool. If you hit that amount, it's a sign to expand the pool's capacity (or do some serious pruning and/or rethink your data storage strategy and usage.)
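To put rough numbers on that, here is a back-of-the-envelope check in Python using the figures from this thread (1.86 TiB referenced, a 4TB target). The 0.2 TiB held by older snapshots and the 80% ceiling are assumptions chosen for illustration, not measured values or ZFS limits, and the sketch treats the whole 4TB as usable.

```python
TIB = 2**40   # bytes in one tebibyte (the unit ZFS reports in)
TB = 10**12   # bytes in one terabyte (the unit drives are usually sold in)

def fill_ratio(referenced_bytes, snapshot_extra_bytes, capacity_bytes):
    """Fraction of the destination consumed by the backup."""
    return (referenced_bytes + snapshot_extra_bytes) / capacity_bytes

referenced = 1.86 * TIB      # "Referenced" figure quoted in the thread
capacity = 4 * TB            # proposed 4TB target, assumed fully usable
snapshot_extra = 0.2 * TIB   # hypothetical space pinned by older snapshots
max_fill = 0.80              # assumed rule-of-thumb ceiling, not a ZFS limit

ratio = fill_ratio(referenced, snapshot_extra, capacity)
print(f"destination would be about {ratio:.0%} full")
print("within headroom" if ratio <= max_fill else "plan for more capacity")
```

Note the unit mismatch: 4TB is only about 3.64 TiB, so the target is smaller than it looks next to a figure reported in TiB.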

The simplest approach is to tether a Replication Task to a Periodic Snapshot Task. (Make sure not to let more than 30 days go by between backups, otherwise you might lose the ability to do an incremental replication, since a needed base snapshot will have been "pruned" on the source.)
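The reason for the 30-day warning can be shown with a toy model in Python: an incremental send needs a snapshot that still exists on both sides. The function names are made up for this sketch, and it simplifies retention to "the source keeps its last 30 nightly snapshots".

```python
from datetime import date, timedelta

def source_snapshots(today, retention_days):
    """Nightly snapshots still present on the source (older ones are pruned)."""
    return [today - timedelta(days=d) for d in range(retention_days)]

def incremental_base(source_snaps, dest_snaps):
    """Newest snapshot that exists on both sides, or None if there is none."""
    common = set(source_snaps) & set(dest_snaps)
    return max(common) if common else None

today = date(2023, 5, 15)
src = source_snapshots(today, retention_days=30)

# Destination last replicated 10 days ago: a shared base still exists.
recent_dest = [today - timedelta(days=10)]
print(incremental_base(src, recent_dest))

# Destination last replicated 45 days ago: the base was pruned on the source,
# so only a full send is possible.
stale_dest = [today - timedelta(days=45)]
print(incremental_base(src, stale_dest))
```

When no shared snapshot remains, the next backup has to transfer the whole dataset again rather than just the changes.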
 

NAMoulton

Technically, as long as your most recent snapshot is replicated to another destination, you have a "backup" of your entire filesystem in its state at the "point in time" that the snapshot was created.
So I could rebuild a dataset with only a most recent Snapshot?...
These older snapshots likely hold "deleted" data, which consumes space not-yet-freed. It's also worth noting, when it comes to NAS, ZFS, and data storage, you want sufficient "breathing" room.
I hadn't thought about any deleted data still occupying space via older Snapshots. Interesting.
The simplest approach is to tether a Replication Task to a Periodic Snapshot Task. (Make sure not to let more than 30 days go by between backups, otherwise you might lose the ability to do an incremental replication, since a needed base snapshot will have been "pruned" on the source.)
So I would just run a Replication Task every night after a Snapshot to send to the off-site?
 
Joined
Oct 22, 2019
Messages
3,641
So I could rebuild a dataset with only a most recent Snapshot?...
There's no "rebuilding". The snapshot is the filesystem.

Think of them as a read-only Word document created every time you click "save". If you open up a "read-only" Word document, you can use it as the starting point to keep writing, while the read-only copy remains immutable.



So I would just run a Replication Task every night after a Snapshot to send to the off-site?
The GUI has its own integrated scheduling that can work in tandem with your snapshots.
 

NAMoulton

There's no "rebuilding". The snapshot is the filesystem.

Think of them as a read-only Word document created every time you click "save". If you open up a "read-only" Word document, you can use it as the starting point to keep writing, while the read-only copy remains immutable.
Okay...this is my ignorance speaking, but how? My most recent Snapshot is 700KiB against the 1.86TiB Referenced dataset. That just doesn't math for me. I thought the math worked out because of the series of incremental Snapshots working together, but I would only need the most recent Snapshot?...

The GUI has its own integrated scheduling that can work in tandem with your snapshots.

Yeah, I just took a look at this. This seems easy enough to set up.

Thoughts on using Backblaze as my offsite? I don't trust the major big names. THEIR servers are secure, but they also hold the encryption key. From a little reading, Backblaze doesn't store the key, only I would.
 
Joined
Oct 22, 2019
Messages
3,641
My most recent Snapshot is 700KiB against the 1.86TiB Referenced dataset.
Your most recent snapshot only consumes an additional 700 KiB. This is because when compared against all previous snapshots and your live filesystem, there's only 700 KiB worth of data that is unique to that snapshot. However, the snapshot itself, as a filesystem, references 1.86 TiB. This means if you just send this particular snapshot to another pool, you will end up with a new dataset on the pool that is 1.86 TiB in size.
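A toy model in Python may make the two numbers easier to tell apart. It treats each snapshot, and the live filesystem, as a set of blocks: "referenced" is everything the snapshot points at, while "used" is only what nothing else points at. The block names and sizes are invented, and real ZFS accounting is more involved than this.

```python
def referenced(blocks):
    """Total size of every block this filesystem/snapshot points at."""
    return sum(blocks.values())

def used(name, datasets):
    """Size of the blocks that only this snapshot points at."""
    others = set()
    for other, blocks in datasets.items():
        if other != name:
            others.update(blocks)
    return sum(size for block, size in datasets[name].items()
               if block not in others)

# Hypothetical block maps: block id -> size (arbitrary units).
datasets = {
    "snap-mon": {"a": 100, "b": 50},
    "snap-tue": {"a": 100, "b": 50, "c": 7},   # "c" was deleted after Tuesday
    "live":     {"a": 100, "b": 50, "d": 3},
}

print("snap-tue referenced:", referenced(datasets["snap-tue"]))
print("snap-tue used:", used("snap-tue", datasets))
```

In this model the Tuesday snapshot "uses" very little because almost all of its blocks are shared, yet sending it on its own would still transfer everything it references.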


Thoughts on using Backblaze as my offsite?
No experience with Backblaze, sorry.
 

NAMoulton

Your most recent snapshot only consumes an additional 700 KiB. This is because when compared against all previous snapshots and your live filesystem, there's only 700 KiB worth of data that is unique to that snapshot. However, the snapshot itself, as a filesystem, references 1.86 TiB. This means if you just send this particular snapshot to another pool, you will end up with a new dataset on the pool that is 1.86 TiB in size.
Okay, so my thinking wasn't too far off.

Sanity check: if I send that Snapshot to a storage device offsite, it'll store it (and all the future Snapshots) just fine even if it's not a ZFS filesystem. Then if something happens to my TrueNAS server and everything is gone, do I just need to "pull" those Snapshots and it'll recover/rebuild(?) all my data to the point in time of the Snapshot?


Sorry if I'm not getting something haha. I GREATLY appreciate this education though!
 
Joined
Oct 22, 2019
Messages
3,641
It'll store it (and all the future Snapshots) even if it's not a ZFS filesystem just fine.
Snapshots are a ZFS feature. You can't send ZFS snapshots to a non-ZFS destination. (It makes no sense.) You can "rsync" or "cp" the files, but you can't replicate a ZFS snapshot (or incrementally replicate the difference between snapshots) that way.
 

NAMoulton

Snapshots are a ZFS feature. You can't send ZFS snapshots to a non-ZFS destination. (It makes no sense.) You can "rsync" or "cp" the files, but you can't replicate a ZFS snapshot (or incrementally replicate the difference between snapshots) that way.
So in order to do an offsite Snapshot auto-backup, I would HAVE to send it to another TrueNAS (or other ZFS) system? Well, that's kind of a bummer haha.
 
Joined
Oct 22, 2019
Messages
3,641
So in order to do an offsite Snapshot auto-backup, I would HAVE to send it to another TrueNAS (or other ZFS) system?
Pretty much.

You can still do file-based backups using your own software, or the built-in Rsync Tasks. Not nearly as efficient, nor do they preserve your ZFS properties and earlier snapshots (filesystems), but they are compatible with most systems.
 