Yes, the title is not the best, so let me explain:
I have 2 servers, obviously running ZFS. One server hosts XenServer VMs over NFS, the other is a backup server. For the sake of this discussion, all VMs are Linux web servers running some sort of control panel (DirectAdmin, cPanel, etc.).
Sounds simple at first, but now for the complication:
Leaving aside the (direct) ZFS backups, snapshots, replication, etc., the backup server also acts as an internal backup for the user files. This is done by letting the control panel mentioned above back up each account's (user's) files and databases into a specific /directory. From there, an external script rsyncs the files each day from each VM's /directory to the backup server.
Now for the real complication, the actual structure and layout (for a single VM, example):
- Control panel - runs its internal backup script, creating a tar file for each user.
  This is a very read-intensive task, especially because users' files are mostly small PHP files, causing a lot of random reads.
- Temp backup location - the CP script stores the tar files in a /directory, which is actually an NFS mount to the backup ZFS server. This is done for 2 reasons:
  - avoid the writes on the main ZFS server.
  - have more space, and actually prevent the main ZFS server from running out of space.
- RSYNC script - takes the tar files from each VM's /directory and puts them in the right (internal) place. This is done to keep a history of backups, saving up to 3 months of those tar files for each user.
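For reference, the retention part of the RSYNC step can be sketched roughly like this. It is not the actual script, just a minimal Python sketch. It assumes that "3 months" means 90 days, that a backup's age is judged by file mtime, and that the history lives under a hypothetical directory passed in as `history_dir`.

```python
import time
from pathlib import Path

RETENTION_DAYS = 90  # assumption: "3 months" approximated as 90 days


def expired_backups(history_dir, now=None, retention_days=RETENTION_DAYS):
    """Return tar files under history_dir whose mtime is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    return sorted(
        p for p in Path(history_dir).rglob("*.tar*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def prune(history_dir, dry_run=True):
    """Delete expired tar files; with dry_run=True only report them."""
    for path in expired_backups(history_dir):
        print(("would remove " if dry_run else "removing ") + str(path))
        if not dry_run:
            path.unlink()
```

Calling `prune(some_dir)` with the default `dry_run=True` only prints what would be deleted, which is a safer way to try it against real backups.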
Apart from the redundant network overhead, the RSYNC script ends up reading from and writing to the same server, which puts a bit of "stress" on ZFS. Even though those are tar files, I see a lot of "little" read/write bursts instead of big sequential IO.
I've tried several strategies, including:
- tar.gz instead of tar
- dedup on the backup server, because in the end the PHP files are practically the same across users (WordPress, Joomla, Drupal, etc.), but I guess with tar files it's not really doing any good.
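One possible reason dedup does little here: ZFS dedup matches whole blocks (records) by checksum, so identical files only dedup if they land at the same offsets inside a block. In a tar file, everything stored before a shared file shifts its offset. The following is a minimal Python sketch of that effect, not a ZFS measurement. The member names, the random payload, and the 4 KiB `block_size` are made up for illustration; the real block size is whatever recordsize the dataset uses.

```python
import hashlib
import io
import random
import tarfile


def make_tar(members):
    """Build an uncompressed tar archive in memory from (name, data) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def block_hashes(blob, block_size):
    """Hash fixed-size blocks, similar to how block-level dedup compares data."""
    return {
        hashlib.sha256(blob[i:i + block_size]).hexdigest()
        for i in range(0, len(blob), block_size)
    }


def shared_blocks(a, b, block_size):
    """Count distinct blocks that appear in both byte strings."""
    return len(block_hashes(a, block_size) & block_hashes(b, block_size))


def demo(block_size=4096):
    rng = random.Random(0)
    # Stand-in for CMS code that is identical for both users.
    common = bytes(rng.getrandbits(8) for _ in range(16 * block_size))
    # Each user has a small private file of a different size in front of it.
    tar_a = make_tar([("a/config.php", b"x" * 100), ("a/core.bin", common)])
    tar_b = make_tar([("b/config.php", b"y" * 700), ("b/core.bin", common)])
    in_tars = shared_blocks(tar_a, tar_b, block_size)
    as_files = shared_blocks(common, common, block_size)
    return in_tars, as_files


if __name__ == "__main__":
    in_tars, as_files = demo()
    print(f"shared blocks inside tars: {in_tars}, as plain files: {as_files}")
```

In this sketch the shared payload matches block for block when stored as a plain file, while inside the two tars almost nothing lines up (at most some trailing zero padding). With tar.gz it should be even worse, since the compressed streams diverge as soon as the inputs differ.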
So... after the long introduction, I'd be more than happy to hear any suggestions for a better strategy.
Just to be clear, while I could change the ZFS layout (raidz, mirror, SSD, etc.), the point here is to first have a better strategy. For the moment, I am fine with less than optimal performance from the pools.
Meaning, if possible, let's avoid a conversation about the actual ZFS layout and server specs, and focus more on how to do the backups and where to put things.
And just one last tiny reminder: again, this is not a pure ZFS replication/snapshot discussion. Those are working, and they are going to keep working... :)
Thanks in advance!