Replication not working on new FREENAS Servers

Status
Not open for further replies.

Kkaplan

Cadet
Joined
Mar 17, 2016
Messages
2
Hi, I am new to FreeNAS, so I hope you can bear with me.

We are trying to set up a very simple server system with 2 FreeNAS servers, 1 as the primary and the other as a backup. We want the primary to act as the "PUSH" server and use the FreeNAS automated snapshots and a replication task to create a backup on the "PULL" server.

We tried this in a test environment and everything was successful. When I moved to the production environment, the first few replications were also successful. Since then, replication has been failing.

The latest error from the replication task on the primary is:
TIMEOUT SERVER "ip address" NOT RESPONDING.

*********************************************************************************
The System Details

We have set up 2 FreeNAS servers that have identical hardware:
Build FreeNAS-9.2.1.7-RELEASE-x86 (fdbe9a0)
Platform Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz
Memory 3450MB

Each server has a single 1TB drive with a single ZFS volume formatted as a stripe.

Server 1 (primary) has
Volume: FILESERVER
Dataset: FILESVDS

Server 2 (bkup) has
Volume: FILESVBKUP
Dataset: BKUPDS

The snapshot task on the primary is set to create snapshots every 4 hours, 8am - 6pm.
The replication task is set to run 7pm - 11:59pm.
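To make the timing above concrete, here is a minimal Python sketch. It assumes the first snapshot fires at the start of the snapshot period and none fire after its end; the thread does not confirm that this is how FreeNAS schedules them. The function names are made up for illustration.

```python
from datetime import time


def snapshot_times(begin_hour, end_hour, interval_hours):
    """Times of day at which a periodic snapshot would fire.

    Assumption: the first snapshot fires at begin_hour and none fire
    after end_hour. This is a guess at the scheduling, not FreeNAS code.
    """
    return [time(hour=h) for h in range(begin_hour, end_hour + 1, interval_hours)]


def in_window(t, window_start, window_end):
    """True if t falls inside the inclusive window [window_start, window_end]."""
    return window_start <= t <= window_end


# Values taken from the post: every 4 hours between 8am and 6pm,
# replication allowed from 7pm to 11:59pm.
snaps = snapshot_times(8, 18, 4)
replication_start, replication_end = time(19, 0), time(23, 59)

for t in snaps:
    inside = in_window(t, replication_start, replication_end)
    print(t.strftime("%H:%M"), "inside replication window:", inside)
```

Under these assumptions every snapshot of the day is taken before the replication window opens, so each evening's replication would have the whole day's snapshots to send.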

**************************************************************************************
I recently tried removing the Dataset and Volume (BKUPDS and FILESVBKUP) from
the bkup ("PULL") server and reestablishing them.

Of course, that process also deleted the 3 replicated snapshots that were already on
the bkup server. With the bkup ("PULL") server now empty, I changed the time on the replication task
to allow it all day, then changed the automated snapshot interval to 15 minutes. I saw a new snapshot
on the primary server, and the replication task started. I then reverted the automated snapshot
task back to 4 hours. The replication on the primary continued for a while and then
failed after it reached about 60%. That is when I received the error
TIMEOUT SERVER "ip address" NOT RESPONDING.

My questions are:

1. Does the replication task run correctly if you have a CIFS share open on the backup server?
2. Are there any additional logs I should look at besides /var/log/messages and /var/log/auth.log?
2b. There is about 85GB of data on the primary server. What is the best way to do the first
replication? Should I use zfs send from the command line, or should I let
the tasks run from the GUI?
3. Are there any issues with the snapshot / replication task on the primary if there are files
open when the tasks run?
4. How can I check to see which users are connected / which files are open on the server?
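For reference on the "zfs send from the command line" question, the sketch below only builds the command string for a manual send over SSH and prints it; it does not run anything. The snapshot name `manual-1`, the address `192.0.2.10`, and the helper name are hypothetical. Note that `zfs receive -F` rolls the destination back before receiving, which discards changes made there, so treat this as an illustration rather than a recommended procedure.

```python
import shlex


def build_replication_command(src_dataset, snapshot, remote_host, dst_dataset,
                              previous_snapshot=None, remote_user="root"):
    """Build (but do not run) a 'zfs send | ssh ... zfs receive' pipeline."""
    send = ["zfs", "send"]
    if previous_snapshot is not None:
        # -i sends only the changes between previous_snapshot and snapshot.
        send += ["-i", f"{src_dataset}@{previous_snapshot}"]
    send.append(f"{src_dataset}@{snapshot}")

    # -F forces a rollback of the destination to its most recent snapshot
    # before receiving. This can discard data on the destination.
    receive = ["zfs", "receive", "-F", dst_dataset]
    ssh = ["ssh", f"{remote_user}@{remote_host}", shlex.join(receive)]
    return f"{shlex.join(send)} | {shlex.join(ssh)}"


# Dataset names come from the post; snapshot names and address are made up.
full = build_replication_command(
    "FILESERVER/FILESVDS", "manual-1", "192.0.2.10", "FILESVBKUP/BKUPDS")
incremental = build_replication_command(
    "FILESERVER/FILESVDS", "manual-2", "192.0.2.10", "FILESVBKUP/BKUPDS",
    previous_snapshot="manual-1")
print(full)
print(incremental)
```

The first call describes a full initial send; the second describes an incremental send that only works if `manual-1` still exists on both servers.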


**************************************************************************************************
My next test is going to be:
1. Make sure the bkup ("PULL") server has no snapshots and no data.
2. Delete all of the snapshots on the primary ("PUSH") server.
3. Reestablish the snapshot and replication tasks on the primary ("PUSH") server.
4. Let the tasks run and see if the replication completes.

After this I think I am stumped, so any help would be appreciated.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
A couple things.
1. Are you sure your production network and connectivity are stable?
2. It's quite possible that you are running out of RAM. The next time you do replication, monitor the usage graphs.
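One rough way to check the connectivity point is to probe the PULL server repeatedly while a replication is running. This is a minimal sketch, assuming replication reaches the PULL server over SSH on TCP port 22 (the thread does not state the port); the function names are made up for illustration.

```python
import socket
import time


def probe(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def watch(host, port=22, attempts=10, interval=1.0, timeout=5.0):
    """Probe the host repeatedly and return the number of failed attempts."""
    failures = 0
    for i in range(attempts):
        if not probe(host, port, timeout):
            failures += 1
            print(f"attempt {i + 1}: {host}:{port} not reachable")
        if i < attempts - 1:
            time.sleep(interval)
    return failures


# Example call (192.0.2.10 is a placeholder, not an address from the thread):
#     watch("192.0.2.10", attempts=60, interval=5.0)
```

A non-zero result during a transfer would point toward the network rather than the replication task itself, though a successful TCP connect does not prove the link can sustain a long transfer.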

1. Does the replication task run correctly if you have a CIFS share open on the backup server?
Yes, it should.
2b. There is about 85GB of data on the primary server. What is the best way to do the first replication. Should I use ZFS Send from the command line?? Or should I let the tasks run from the GUI.
Use the GUI. 85GB should not take very long.
3. Are there any issues with the snapshot / replication task on the primary if there are files
open when the tasks run?
No. Replication only replicates snapshots, so nothing is "open".
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

Kkaplan

Cadet
Joined
Mar 17, 2016
Messages
2
depasseg, thanks for the info.

I know we are a little low on RAM. We don't expect to have much usage on a daily basis. These machines are basically for backup.
I also read that, given my hardware configuration, I should be using the x64 version of FreeNAS. Once we have a good understanding,
we will probably upgrade.

*********** Replication Task Update ***********
I removed all the snapshots on the primary server and then deleted everything on the backup server's dataset.
I deleted the replication task on the primary server and established a new task.
Then I restarted the replication task and automated snapshots on the primary server, and everything
replicated correctly. Yea!!!

I guess the final result is: if there is any discontinuity between the replication task and the "PULL" dataset being replicated,
then the replication system may get stuck?
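One way to picture the "discontinuity" described above: an incremental ZFS send needs a snapshot that exists on both sides to send changes against. The sketch below finds the newest such snapshot from two lists of names. Whether a missing common snapshot was the actual cause here is not confirmed by the thread, and the snapshot names and function name are made up for illustration.

```python
def latest_common_snapshot(push_snapshots, pull_snapshots):
    """Return the newest snapshot name present on both sides, or None.

    Both lists are assumed to be ordered oldest to newest.
    """
    on_pull = set(pull_snapshots)
    for name in reversed(push_snapshots):
        if name in on_pull:
            return name
    return None


# Hypothetical snapshot names, oldest first.
push = ["auto-0800", "auto-1200", "auto-1600"]
pull_in_sync = ["auto-0800", "auto-1200"]
pull_recreated = []  # e.g. after the PULL dataset was deleted and recreated

print(latest_common_snapshot(push, pull_in_sync))    # auto-1200
print(latest_common_snapshot(push, pull_recreated))  # None
```

When the result is None there is no base for an incremental send, so a full send from scratch is needed, which matches the clean restart that worked in the update above.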
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I know we are a little low on RAM. We don't expect to have much usage on a daily basis. These machines are basically for backup.
I also read that, given my hardware configuration, I should be using the x64 version of FreeNAS. Once we have a good understanding,
we will probably upgrade.

Short version: you can lose all your data if you don't have at least 8 GB of RAM whatever your usage is.
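To put numbers on that, here is a small Python sketch comparing the 3450MB reported in the first post with the 8 GB figure above. It treats both as binary units (MiB and GiB), which is an assumption; the function name is made up for illustration.

```python
# The 8 GB figure from the reply above, taken here as 8 GiB (assumption).
MIN_RECOMMENDED_BYTES = 8 * 1024**3


def ram_shortfall_mb(installed_mb, minimum_bytes=MIN_RECOMMENDED_BYTES):
    """Return how many MB short of the minimum the system is (0 if none)."""
    installed_bytes = installed_mb * 1024**2
    shortfall_bytes = minimum_bytes - installed_bytes
    return max(0, shortfall_bytes // 1024**2)


# 3450 is the "Memory 3450MB" value from the system details in the first post.
print(ram_shortfall_mb(3450))  # 4742
```

So each server as described is roughly 4.6 GB under the stated minimum.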
 