BetYourBottom
Contributor
Joined: Nov 26, 2016
Messages: 141
I've been thinking about this quite a bit lately. It seems most backup solutions take forever to scan your files for changes, upload everything every time, or don't allow easy versioning (Rclone doesn't seem to have any versioning; Duplicati takes a long time to compile and send files for larger datasets).
I was wondering if it might be possible to send ZFS snapshots directly to a cloud backup. Since snapshots are virtually instant, all the change detection is handled by ZFS itself, so you don't need lengthy checksumming or inaccurate timestamps. Snapshots can be diffed using tools built into ZFS by default, so filtering out only the changed data is handled for free as well, and versioning just comes with the territory when dealing with snapshots. So it seems like if there were a way to directly upload snapshots, that would drastically simplify backups.
Based on all the reading I've been doing lately on ZFS and RClone, I came up with a command that I think should work (I haven't been able to test it yet; I'm moving drives and will use the freed ones later for a test pool). I'd like some feedback on whether something like this could work as simply as I think it may.
zfs send pool@longestlife | gzip | gsplit --bytes=10M --filter='rclone rcat remote:path/to/$FILE'
The first command sends a full snapshot to stdout, with longestlife referring to the periodic snapshot that expires the latest (I would like to integrate this with periodic snapshotting) to create a baseline. gzip then compresses the data stream produced by zfs send and feeds it into gsplit; gsplit chops the stream into 10MB chunks and hands each one to rclone to upload to a path on your remote. (Note the single quotes around the --filter argument: $FILE has to reach gsplit unexpanded, since gsplit sets it to each chunk's name.)
My current concern is that this might either swamp your RAM by doing everything through stdout, and/or spin up thousands of rclone instances at once, causing other performance issues. I'm not an expert on how these commands behave over pipes, but in the ideal case there would be some blocking, so that only so many rclone instances are open at a time and no giant RAM cache gets created.
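To make that concrete, here's roughly the script I have in mind for the baseline upload. Untested, and tank, longestlife, and the remote path are all placeholders. As far as I can tell from the coreutils docs, split runs the --filter command once per chunk and waits for it to finish before starting the next, which would give exactly the blocking behaviour I'm hoping for, but I want to verify that.

#!/bin/bash
# Untested sketch of the baseline upload -- POOL, SNAP, and REMOTE are
# placeholders for my pool, the long-lived snapshot, and the rclone remote.
set -o pipefail

POOL=tank
SNAP=longestlife
REMOTE=remote:zfs-backup/$POOL/$SNAP

# Full send, compressed, chopped into 10M chunks. gsplit exports each
# chunk's name (xaa, xab, ...) as $FILE, so $FILE is single-quoted to
# keep this shell from expanding it before gsplit does.
zfs send "$POOL@$SNAP" | gzip |
  gsplit --bytes=10M --filter="rclone rcat $REMOTE/"'$FILE' -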
If this works out, then for newer versions the only things that would need to change are the remote directory and the zfs send command, which would become something like
zfs send -i pool@longestlife pool@latestsnap
to send a diff of the two snapshots.
To do a full restore, I would think something like this would work:
rclone cat remote:path/to/dir | zcat | zfs receive pool/dataset
Basically the reverse of the previous command, and it would need to be run on each of the uploaded snapshots as well, starting from the baseline.
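Sketched out as a script (untested; the target dataset and the incremental snapshot names are made up, and I'm assuming rclone cat returns a directory's files in name order so the chunks reassemble correctly -- if it doesn't, each chunk would have to be catted individually in sorted order):

#!/bin/bash
# Untested restore sketch -- remote path, target dataset, and snapshot
# names are placeholders.
set -o pipefail

REMOTE=remote:zfs-backup/tank
TARGET=tank/restored

# Baseline first: concatenate the chunks, decompress, receive.
rclone cat "$REMOTE/longestlife" | zcat | zfs receive "$TARGET"

# Then replay each uploaded incremental on top, oldest first.
for snap in daily1 daily2; do    # hypothetical snapshot names
  rclone cat "$REMOTE/$snap" | zcat | zfs receive "$TARGET"
done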
Issues I can see with this are:
- It might be difficult to rebase the backup onto a newer snapshot. Say you have a snapshot that lasts for 1 year: this backup solution would work until it expires, but at that point the only way I can see it working would be to do a fresh full upload based on the newest 1-year snapshot.
- I don't know how a partial restore would work. I'm not familiar enough with zfs send and zfs receive to know how partial restores work with them, let alone how one would work through this upload scheme.
- I don't know how rclone deals with errors during transfer, how much it confirms file integrity, and how an error would propagate back to stop the backup should an issue arise (one idea for at least catching pipeline failures is sketched below).
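On that last point, the one mitigation I know of is bash's pipefail option: the pipeline's exit status then reflects a failure in any stage rather than just the last one, so the script could at least abort loudly instead of silently recording a broken backup. What I still don't know is how gsplit reacts when its filter command (rclone here) fails mid-run; that needs testing. A sketch:

#!/bin/bash
# With pipefail, a failure in zfs send, gzip, or gsplit makes the whole
# pipeline return non-zero instead of only reporting the last stage.
set -o pipefail

if ! zfs send tank@longestlife | gzip |
     gsplit --bytes=10M --filter='rclone rcat remote:zfs-backup/tank/longestlife/$FILE' -
then
  echo "backup pipeline failed, aborting" >&2
  exit 1
fi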