DR Testing


Derek Humes

I have successfully set up a FreeNAS 9.10.2 box in my main office and am taking daily snapshots. I am replicating those offsite to another FreeNAS 9.10.2 box. I am testing DR so I understand what I would need to do if disaster strikes - before moving us to production use.

I did a test of destroying (removing) the boot disk, and then freshly installing FreeNAS on new disks, and restoring my configuration from the .db file I have replicated to my remote location.

The next test I would like to do is destroying (removing) the storage disks in my main office, and then restore the data from my replicated snapshots at my remote office.

I have looked up many articles, but I am still not entirely clear on what the process should be.

Once I have installed working storage disk(s) into my main office FreeNAS, what should I do next to restore the data from my remote location?

I have 51 datasets. Would I have to recreate them all at my main location and then simply clone and copy the file contents from the remote location to my main office? Or is there a simpler command that could be used to send everything back to my main office?

Thank you!!!
 

nojohnny101

No need to recreate anything on the main box before you replicate your data back. All you need to do is replicate it back through the GUI or manually with the zfs send | zfs receive command.
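If you go the manual route, the general shape is something like this (a rough sketch only - the pool, dataset, and snapshot names here are placeholders, not your actual names):
Code:
# On the backup box: send the most recent replicated snapshot back to the main box.
# -R includes all child datasets; -F lets the existing destination be overwritten.
zfs send -R backuppool/backupdataset@latest-snap | ssh root@main_office_ip "zfs receive -F mainpool/target_dataset"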
 

Derek Humes

I am reading, but failing to fully grasp this. I have 51 datasets: one top-level dataset that contains the other 50 beneath it, nested at various levels.

I am SSH'd into my remote site. Do I try to send the existing snapshots with zfs send/receive, or do I create a new snapshot manually?
 

Derek Humes

Here is what I am trying right now:

zfs snapshot -r Volume/nameofreplicationdataset@testingsnapshot
zfs send -R Volume/nameofreplicationdataset@testingsnapshot | ssh root@mainofficeIP "zfs receive KIKO_datastore"

KIKO_datastore is the name of the old top-level dataset.

This gave me an error, however, that KIKO_datastore already exists on the main office box and that I had to use -F to overwrite it. I tried that flag, but received this error:
Code:
cannot unmount '/var/db/system': Device busy

Instead I then tried:
zfs send -R Volume/nameofreplicationdataset@testingsnapshot | ssh root@mainofficeIP "zfs receive KIKO_datastore/KIKO_datastore"

This command is now running and is transferring my data back. However, because I added /KIKO_datastore, I have a KIKO_datastore dataset with another KIKO_datastore dataset underneath it, and then my other datasets - it added another "top-level" dataset, if that makes sense...

In my main office, my volume name was KIKO_datastore, the first dataset was named KIKO_datastore, and my sub-level datasets branched off from that.

Is there something I should have done differently? I can easily blow it all out and test again in a few minutes.
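From what I have read so far (and I may have this wrong), the name you hand to zfs receive becomes the point where the received tree attaches, which is why targeting KIKO_datastore/KIKO_datastore adds a level. If I end up keeping this copy, it looks like I could flatten it afterwards with zfs rename - the dataset name below is just an example:
Code:
# Move a received child dataset up one level within the same pool
zfs rename KIKO_datastore/KIKO_datastore/somedataset KIKO_datastore/somedataset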
 

Derek Humes

I stopped my test and am back to my previous configuration, with the disks removed to simulate disaster striking on my storage disks. I have 408 snapshots at my remote location of my KIKO_datastore dataset. I set up the original snapshot to be recursive, so I have daily snapshots of my KIKO_datastore dataset, and the other 50 below it, for about the last week or so. I have another meeting to go into now, but am going to play with this more this afternoon.

One other note, if relevant: the remote volume/dataset (configured in my replication task) that I am replicating to is KikoExecutive/replicationstore. I am not sure yet how the different names will affect my zfs send/receive job.
 

nojohnny101

I will let someone with more knowledge chime in on troubleshooting your "zfs send/receive" command, as I haven't personally used those previously, but I know a lot of the veterans on here always recommend them.

Personally in the past when I have had to move over multiple datasets, I would just create a replication task in the GUI going the other way (from my backup box to my main box) and send the data that way. I don't know if this is the best way, but it has worked for me multiple times with no problems to speak of.
 

Derek Humes

Thank you for your input, I will take that into consideration. I had been thinking of the replication route, but I would like to understand ZFS Send/Receive better.
 
Joined Dec 2, 2015
I used zfs send and receive to move my data from my original server to its replacement, using the info in this thread as my starting point, plus some additional info from man zfs and Google.

Using the GUI, create a snapshot of the pool, selecting "Recursive snapshot" so it includes all the datasets on that pool. Or you can use the CLI: zfs snapshot -r pool_name@migrate.
Then, using the CLI, replicate the snapshot to the pool on the other machine. My notes are not complete, but I think the command I used was zfs send pool_name@migrate | ssh root@IP_of_new_machine "zfs receive pool_name_on_new_machine"
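Putting that together with the recursive pieces (again, from memory - verify against man zfs before relying on it; the pool names are placeholders):
Code:
# Recursive snapshot of the pool and every dataset in it
zfs snapshot -r pool_name@migrate
# -R sends the snapshot plus all descendent datasets; -F overwrites the existing destination
zfs send -R pool_name@migrate | ssh root@IP_of_new_machine "zfs receive -F pool_name_on_new_machine"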

Good luck
 

Derek Humes

I am trying this now. I took a snapshot on my remote install using
zfs snapshot -r KikoExecutive/replicationstore@manualtest
I then am issuing this command to send it back to my main office
zfs send -R KikoExecutive/replicationstore@manualtest | ssh root@myipaddress "zfs receive KIKO_datastore"

Upon running this command I receive the following output:
Code:
cannot receive new filesystem stream: destination 'KIKO_datastore' exists
must specify -F to overwrite it


So then I added the -F flag, and the output was:
Code:
cannot unmount '/var/db/system': Device busy


Thoughts?

I do have a very basic question as I am learning. When you, or really anyone, refer to a "pool", what exactly is considered a pool? So far in reading the instructions, I am familiar with volumes, and with creating datasets underneath them. Is pool synonymous with volume, or with dataset? On my remote server, my replicated data is located under KikoExecutive (volume) - replicationstore (dataset). On my main server it was stored in KIKO_datastore (volume) - KIKO_datastore (a dataset that appears to have been created automatically, and which I cannot delete or rename as far as I can tell).
 
Joined Dec 2, 2015
Derek Humes said:
I am trying this now. I took a snapshot on my remote install using
zfs snapshot -r KikoExecutive/replicationstore@manualtest
I then am issuing this command to send it back to my main office
zfs send -R KikoExecutive/replicationstore@manualtest | ssh root@myipaddress "zfs receive KIKO_datastore"

Upon running this command I receive the following output:
Code:
cannot receive new filesystem stream: destination 'KIKO_datastore' exists
must specify -F to overwrite it
You need to delete the 'KIKO_datastore' dataset.
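Assuming it is a child dataset and not the pool's own top-level dataset (which cannot be destroyed on its own), something like this from the CLI should work - the dataset name is an example only, and the destroy is irreversible:
Code:
# See what already exists on the destination
zfs list -r KIKO_datastore
# Destroy a conflicting child dataset and everything beneath it
zfs destroy -r KIKO_datastore/existing_dataset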
 

Derek Humes

When I try to delete it from the GUI I get this error:
Code:
cannot unmount '/var/db/system/samba4': Device busy


So then I realized that is probably the SMB service, so I stopped it and tried to delete again, but got this error:
Code:
cannot unmount '/var/db/system/syslog-9aa167b9e1f74d34bc7e115be716e370': Device busy


I didn't actually create this dataset... it seems it was created automatically when I created the new volume.
 
Joined Dec 2, 2015
Hmm. I recall having a similar problem when I transferred my data, but I don't recall the specific steps I took to resolve it. I do know that there cannot be any existing datasets that conflict with what you are trying to create via replication.

Things to try:
  1. Restart FreeNAS on the new machine, then try deleting the dataset again, using the GUI
  2. Delete the dataset using the CLI, then restart FreeNAS.
  3. Detach and destroy the whole volume on the new machine, then recreate it.
Pools vs volumes - the problem is that the FreeNAS GUI is somewhat confusing. The nomenclature used differs somewhat from what ZFS uses. When you create a volume using the GUI, you are creating a zpool, as I understand it. The ZFS docs (and the ZFS commands) use the word "volume" to refer to a specific type of dataset. Very confusing.
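You can see the distinction from the CLI, though; generic names below, not your actual pools:
Code:
# Pools - what the FreeNAS GUI calls "volumes"
zpool list
# Datasets - note the pool's root dataset shares the pool's name
zfs list -r pool_name
# A ZFS "volume" in the zfs(8) sense is a block-device dataset (zvol), e.g.:
zfs create -V 10G pool_name/example_zvol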
 

Derek Humes

1. Restarted and tried to delete; same errors.
2. Received the same errors via the CLI.
3. Tried this, but it still automatically creates that first dataset underneath. Perhaps I am not understanding how it works.

RE: items #1 and #2 - I realized the system dataset is located there, so I think that is why it won't let me delete it.

I have attached a screenshot of my "Volumes" tab, showing my single KIKO_datastore volume and the KIKO_datastore dataset underneath it. This is the dataset that it will not allow me to delete, I assume because the system dataset is on it. I then thought I would try to create another dataset under the volume, in parallel with the "KIKO_datastore" dataset, but I cannot do this. Is there always a hierarchy of {volume name}, then {first dataset with a name matching the volume}, and only underneath that dataset can the admin create their own datasets? I think a firm understanding of this is where my disconnect is.
 

Attachments

  • Screen Shot 2017-01-26 at 9.40.31 AM.png
Joined Dec 2, 2015
System dataset - I had that issue too. You'll need to move it somewhere else. Either to the boot volume (OK if you have a boot SSD, but not recommended if you are using USB flash sticks), or to a dataset that will not be overwritten by this replication.
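If it helps to confirm where it currently lives before moving it (System -> System Dataset in the GUI), the system dataset shows up as a .system dataset on whichever pool is holding it - pool_name below is a placeholder:
Code:
# List datasets on the pool and look for the .system tree
zfs list -r -o name,mountpoint pool_name | grep -i system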
 

Derek Humes

Well, I just got a good lesson in why you do this in TEST!

I decided to just try pushing my data back to a different dataset on the original server, so I ran
zfs send -R KikoExecutive/replicationstore@manualtest | ssh root@myipaddress "zfs receive -F KIKO_datastore/KIKO_Fileshare"

And so this started bringing everything back. However, since this is a test and I didn't want to wait countless hours for my 60GB to transfer back over a WAN link, I cancelled the command in the terminal with a Ctrl-C.

I then moved the system dataset on my MAIN server to the boot disk, as you indicated.

To my next astonishment, and I have no idea how yet, my snapshots and replicated data on my remote location are GONE. I don't even see the replicationstore dataset any more!

Now, I looked back at my main server. Since restoring my configuration today for a "clean slate", my periodic snapshot task was enabled, as well as my replication task, and my replication task is set to "delete stale snapshots on remote system". However, my main server has not created any snapshots to replicate over to my remote site, and my remote site now lists 0 snapshots. I am really trying to wrap my head around what I could have done that made my data go poof.

In the GUI for my remote site, I don't see the replicationstore dataset at all. However in the CLI I can see it, and two directories underneath it, which contain no files or folders.

I am continuing to investigate.
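For reference, here is how I am checking from the CLI what is still there on the remote pool (KikoExecutive is my remote volume; as I understand it, -t all lists both datasets and snapshots):
Code:
# List every dataset and snapshot remaining under the remote pool
zfs list -r -t all KikoExecutive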
 

Derek Humes

Revised the above to indicate that I moved the system dataset on my main (new) server, not at my remote location holding my replicated data.
 
Joined Dec 2, 2015
This is also quite familiar. It looks like the GUI and CLI are out of sync. Restart the server that is not showing snapshots.
 

Derek Humes

I will try that shortly. I am not seeing the snapshots in the CLI on my remote server either, though... very strange.
 

Derek Humes

Still gone after a restart of the remote server. It is just so strange that the entire replicationstore dataset is gone. Even if the replication task had deleted stale snapshots, the replicationstore dataset on the remote server is still where they would have gone, and it should not have been deleted itself. I combed through the SSH window to see if I had somehow issued an incorrect command that deleted KikoExecutive/replicationstore from my remote server, but nothing stands out.
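One more thing I am going to check (just a standard ZFS facility, if I understand it correctly): the pool itself keeps a log of the zfs/zpool commands that have been run against it, which should show any destroy or overwriting receive:
Code:
# Show the history of ZFS operations on the remote pool
zpool history KikoExecutive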
 