FreeNAS '11' and replication

Status
Not open for further replies.

milkman_dan
Cadet
Joined: Mar 11, 2014
Messages: 5

(If this isn't the most appropriate place to put this post, I apologize)

I'd like to talk about making replication between FreeNAS hosts better and more reliable in the next (real) release.

I've had several bad experiences with replication in the past on FreeNAS 8.x and 9.x, enough that I don't trust it or the middleware to do the 'right thing.' I was much less experienced with FreeNAS and Unix in general when these errors and bad things occurred, so I can't offer much insight into why they happened. Replications broke. That isn't the point of the thread.

One of the things I was most happy with in Corral was that replication seemed to have gotten a makeover. It (for me anyway) was already more reliable than 9.x and was more flexible.

I'll leave my feelings about having to roll back to 9.x out of this, because they're uh...highly negative, but I'll eventually cool off. Still, I'm very concerned about replication in 9.x and improving it in 11.

I'll try and be concise:

  • Replication in 9.x is inflexible - By which I mean, I can only replicate to a host when a periodic snapshot task is also running. Don't tie me to a snapshot schedule (or even make me have snaps at all, if I'm insane) just to replicate filesystems in the GUI. In Corral, replications take their own snaps and run independently of periodic snapshots, and this is definitely the Right Way to do this sort of thing.
  • Replication tasks need a first class management interface (and so do snapshots) - Replications need a 'top level' UX, where everything you need to set them up, monitor them, resume them, and most importantly, use them to restore your data is all in one place, in a coherent interface. Snapshots and replications are closely related, and should be in the same place UX-wise IMO. Tie-ins for doing things with snaps in the context of the 'storage' UX should still be there, but give me the 30k foot view of my snaps and replications, status, tasks and so on.
  • The Peering was cool - One thing that I intensely dislike about the current replication UX in 9.x is that the destination host basically doesn't 'know' anything about what's going on. Its replication UI doesn't reflect anything about the state of replication. It's just "well, I have these datasets, I guess?" It's 'dumb,' and the active peering of 10 was superior in every way. There, I could see that the slave knew about the link, that it was aware of the status of that link, and that it knew when it was behind or when there was an error with the replication, etc. This builds confidence that your backup regimen is working. It wasn't 100% in Corral (I had issues on some versions with the slave retaining normal and replication snaps past their expiry), but it was easy to see that it was working.
  • Recovery needs to be automatic, or at least, heavily guided - I understand iX can't anticipate every single use-case when designing a recovery experience, but right now, there basically isn't one. It feels like 'if you're using replication, we trust you know how to use it to recover.' It feels bad man. The UX, being aware of the replication target, should be able to be 'in on it' with regards to recovery as well. I should be able to go to the slave, tell it "I need these filesystems pushed back to the host, here's the mount point" and then walk away. Done. Restore the SMBs and NFSs and iSCSIs and such from a backup of the config, and I'm back up and running. This is a huge deal. At the very least, recovery needs to be fully and helpfully documented. Currently, there is nothing in the official documentation, no procedure to even help a new user get started using replication to recover from a disaster. This is an oversight. Also, I understand that to an extent, replicating to non-FreeNAS targets needs to be an option. However, the best integration and the best guarantees need to be with FreeNAS boxen on both ends of the replication. That's where you should delineate between 'you seem to know what you're doing, good luck' and 'let us take care of that for you' on this experience.
I have myriad other concerns about 11, but if I had to grovel on the ground and beg for something from Corral to make it 'forward' into 11, this is it. Please, please, make replication a first-class citizen in your UX. Address the fragility and inflexibility in the current setup. 'Own' recovery as well; it will only make TrueNAS better.
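
P.S. To make the "replications own their snaps" and "push it back from the slave" points a bit more concrete, here is a rough sketch of the kind of thing I'd expect the middleware to be doing under the hood. This is only an illustration, not how FreeNAS or Corral actually implement it. The function names, the `repl-` snapshot prefix, and the host/dataset names are all made up for the example; the only real pieces are the standard `zfs snapshot`, `zfs send -i`, and `zfs receive -F` commands piped over `ssh`. It just builds the command strings and runs nothing.

```python
from datetime import datetime, timezone
from typing import List, Optional


def replication_commands(
    dataset: str,
    target_host: str,
    target_dataset: str,
    previous_snap: Optional[str] = None,
    now: Optional[datetime] = None,
    prefix: str = "repl",  # hypothetical naming scheme for replication-owned snaps
) -> List[str]:
    """Build the commands for one replication run that takes its own snapshot.

    No periodic snapshot task is involved: the run creates a dedicated
    snapshot, then sends it (incrementally if a previous replication
    snapshot is known) to the target over ssh.
    """
    now = now or datetime.now(timezone.utc)
    snap = f"{dataset}@{prefix}-{now:%Y%m%d-%H%M%S}"
    if previous_snap:
        send = f"zfs send -i {previous_snap} {snap}"  # incremental stream
    else:
        send = f"zfs send {snap}"  # first run: full stream
    return [
        f"zfs snapshot {snap}",
        f"{send} | ssh {target_host} zfs receive -F {target_dataset}",
    ]


def restore_command(backup_host: str, backup_snap: str, local_dataset: str) -> str:
    """Build the command that pulls a replicated snapshot back to the primary."""
    return f"ssh {backup_host} zfs send {backup_snap} | zfs receive -F {local_dataset}"


if __name__ == "__main__":
    # Hypothetical hosts and datasets, for illustration only.
    for cmd in replication_commands("tank/data", "backupnas", "backup/data"):
        print(cmd)
    print(restore_command("backupnas", "backup/data@repl-20170401-120000", "tank/data"))
```

The point is that the replication task tracks its own last-sent snapshot, so it never depends on whatever the periodic snapshot schedule happens to be doing, and the restore is the same pipe pointed the other way.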
 