zpool import/export from command line

Status
Not open for further replies.

gdreade

Dabbler
Joined
Mar 11, 2015
Messages
34
I'm working on a command-line tool that, for part of its operation, needs to be able to import and export volumes (think of hot-swapping a disk that is not part of the main pool(s)). Playing around with my test FreeNAS system, I can do the zpool import and export from the command line just fine, but the imported (single-disk) pool does not show up in the FreeNAS web UI.

This is not necessarily a bad thing, because the only thing accessing the imported pool should be my tool anyway.

However, it did make me realize that there's apparently additional state that the web UI is using, somewhere, to keep track of its pools. For example, if I import the volume in the web UI and then export it via the command line, the web UI goes into an error state (it thinks it still has the volume but has lost the backing storage).

Does anyone know offhand what this extra state mechanism looks like, how it's controlled, etc., without needing to go peruse the source?

If I knew definitively that the web UI was never going to touch the imported pool that would suffice. Otherwise I'm trying to figure out what the potential interactions are.
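
For reference, the zpool side of this works fine by hand; it's only the web UI's bookkeeping I'm unsure about. Roughly (the pool name is a made-up example):

  # list pools that can be imported from devices under /dev
  zpool import -d /dev
  # import the single-disk pool by name
  zpool import -d /dev drpool
  # ... my tool reads/writes the pool here ...
  # cleanly export it again before pulling the disk
  zpool export drpool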
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Does anyone know offhand what this extra state mechanism looks like, how it's controlled, etc., without needing to go peruse the source?
No, but I do know that everything will be different with version 10.x. The most relevant feature is a custom CLI that talks to the system the same way the GUI does, to keep everything consistent, so you might want to focus your effort on that platform instead. Note that it's still in alpha.
 

gdreade

Dabbler
Joined
Mar 11, 2015
Messages
34
Thanks. I will eventually look at supporting FreeNAS 10, but the tool needs to run on 9 because I have systems in production now that need it. However, the code for these particular operations will be localized, so supporting a new CLI when the time comes should not be too much of an issue, provided that its output is reasonably easy to parse.

FWIW, this is a Perl script for minimizing the pain involved in creating an offline disaster recovery (DR) copy of arbitrarily large pools to removable disks. It works roughly as follows (a sketch of the core pipeline appears after the list):
  1. Invoke a hook to quiesce any user-defined services.
  2. Take a snapshot of the pool.
  3. Invoke a hook to reverse step (1).
  4. Bundle it up with either zfs send (snapshot based) or tar (file based); user's choice.
  5. Manage the mounting/unmounting of the destination disk volumes.
  6. Split the archive stream into separate files on the disk volumes (user-defined max file size, default 2G).
  7. Add metadata and README-type information, and a copy of the tool, on at least one of the disk volumes in order to aid with recovery.
  8. Remove the snapshot.
  9. Be able to reverse the whole process to perform a recovery from said archive volumes.
  10. Finally, do all this using standard tools and file formats so that in a pinch, an experienced administrator can recover all the data without necessarily having the tool.
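A rough by-hand sketch of the zfs send path (steps 2, 4, and 6) using only standard tools, per step 10; all names here are made up, and the real tool does its own splitting (see my next post) plus the hooks, volume management, and metadata:

  # 2. snapshot the pool
  zfs snapshot tank@dr-20150311
  # 4 & 6. serialize the snapshot and split it into <=2G chunks on the mounted removable disk
  zfs send tank@dr-20150311 | split -b 2048m - /mnt/drdisk1/tank-dr.part.
  # 9. recovery, in a pinch, with nothing but standard tools
  #    (restorepool must already exist; spanning multiple disks is what the tool adds)
  cat /mnt/drdisk1/tank-dr.part.* | zfs receive restorepool/tank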
This is targeted at:
  1. small shops that don't necessarily have an enterprise backup system in place; or
  2. cases where the data quantities are such that doing DR over the network to another FreeNAS is not feasible; or
  3. cases where neither tape archiving nor network-based archiving is available.
This is not a full backup solution and will not become one; it is only targeted at making "level zero" DR copies.

I'm restricting Perl module use to those that come stock in FreeNAS 9, but I'm also testing on FreeBSD 10. The majority of the code (sans the ZFS-specific operations) runs on CentOS 6 with slight variations.

I'll share when it's done (that being defined as "good enough for my production data").
 

gdreade

Dabbler
Joined
Mar 11, 2015
Messages
34
I wouldn't be surprised if step 6 has an off-the-shelf solution available.

Well, there have of course been programs like split(1) around for years, and tar(1) has multi-volume capability, but I couldn't find anything that did quite what I was looking for, so I implemented that part in Perl. I've implemented a base module that does the splitting for an arbitrary data stream, and it is already tested for split, join, file management, and volume management. What remains is feeding that module with the snapshot data during backup, automating the zpool import/export, and extracting the data stream during recovery.

For me, split's problem is that there is no hook that can be invoked when switching output files. Tar's problem is that you either have fixed-size tar volumes, or you have to emulate a character-special device that tar thinks is a tape drive (and can therefore detect end-of-tape). Plus, not all versions of tar have multi-volume capability (and I want to be able to use a variation of this tool on such systems).

For this tool, the volume change will be triggered when any of the following conditions are met:
  1. When the volume usage is past a given percent (default 100%). Any space reserved for the superuser is ignored.
  2. When the available volume space drops below a fixed value (default 1MB).
  3. If specified, when the space used by this archive exceeds a fixed value. There is no default for this one, as it is normally only used for testing.
Regarding (1), the default is 100% because this is intended for write-once situations and we're not concerned with fragmentation or snapshots on the archive media. If anyone knows of a good reason why the default should not be 100%, please speak up.
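
The checks themselves are trivial; a rough sh equivalent of conditions (1) and (2), with made-up paths and the defaults above (the tool does this in Perl):

  #!/bin/sh
  VOL=/mnt/drdisk1        # destination volume (made-up path)
  MAX_PCT=100             # condition 1: switch when usage reaches this percent
  MIN_AVAIL_KB=1024       # condition 2: switch when available space drops below 1MB
  # FreeBSD df -k columns: Filesystem 1024-blocks Used Avail Capacity Mounted on
  used_pct=$(df -k "$VOL" | awk 'NR==2 { sub("%", "", $5); print $5 }')
  avail_kb=$(df -k "$VOL" | awk 'NR==2 { print $4 }')
  if [ "$used_pct" -ge "$MAX_PCT" ] || [ "$avail_kb" -lt "$MIN_AVAIL_KB" ]; then
      echo "switch to next volume"
  fi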
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Don't use Perl, please! In today's world, Python is a much better choice; Perl is dead.
 

Bigtexun

Dabbler
Joined
Jul 17, 2014
Messages
33
This is an interesting approach. I actually did something similar under FreeBSD, but my approach was entirely different.

We had been using an enterprise tape system, and tapes were shipped off-site on a weekly basis. When I came on the job, I was asked to restore some data, so I tried to read the tape and got nothing. So I started an end-to-end evaluation of the backup process. The first thing I noticed was that the read/write stats on the tapes showed lots of writes, but not enough reads to have read the tapes even once. That meant no backup tape had ever been verified, and for me that means there is no backup. I went into the UI of the tape system and turned on the verification step, and we never had a "successful" backup again. It turns out the tapes were not readable, and hadn't been in years. Previous staff had blindly fed the tape robot and must never have had any requests to restore data. So I started evaluating options, ranging from replacing the tape drives in the robot to buying a new enterprise backup system to doing something else entirely. Replacing the drives was going to cost around $10k, and there was the problem of stale software licenses and support that was going to get sticky. The tapes were also expensive, and flushing through a fresh set of tapes was going to triple the overall cost. Management didn't like the sudden expense, but they realized they had a huge exposure.

After analyzing all of the options, I asked for about $5k to build a completely new open-source system, based on a FreeBSD beta that had been released during the second week of my research, which included ZFS. It wasn't today's version of ZFS, but it had some features I wanted, and the price was right. The $5k was split between drives and a drive chassis with extra drive trays. The software I wrote was nothing more than sh scripts; it used rsync to do all of the main work and ZFS snapshots to add day-to-day granularity in picking files to restore. Now, this /was/ a network-based backup, but rsync added some magic because the day-to-day deltas were quite small, so once the system was seeded with the first backup, the hope was that regular backups would go quickly with little network load.

I had 300 machines to back up, and the first backup took nearly two weeks to complete. The surprise came when I did the second rsync: it completed so fast I was sure it had failed. As it turned out, it was taking less than 5 minutes to run through all of the servers and complete the backup. The databases on systems that had them were handled differently: databases and most log files were excluded from the rsync, but depending on the nature of each database, a dump was produced and placed in a folder that was not excluded. Some of the dumps were daily, and others were less often.

My script would start each morning by making a snapshot of all of the filesystems on the backup server. It would then initiate the rsync script on each server over ssh. Each server was backed up FILE BY FILE to a directory with the server's name, and permissions were managed with NIS, so the backup server could be accessed by users with the same permissions as on the original systems. The daily snapshots were read-only and used a naming convention that made it easy to find the date you wanted; when you cd'd into a snapshot you saw the list of machine-named folders, inside of which was a file-by-file copy of the data from that day.
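
The whole daily cycle was only a few lines of sh; the dataset name, hostnames, and excludes below are made up, but the shape is what I described:

  #!/bin/sh
  TODAY=$(date +%Y-%m-%d)
  # read-only, date-named snapshot of the backup filesystem(s) before refreshing them
  zfs snapshot -r backup/servers@${TODAY}
  # pull each server file by file into its own machine-named directory
  for host in web1 db1 mail1; do
      rsync -a -e ssh --delete \
          --exclude='/var/log/*' --exclude='/proc' --exclude='/dev' \
          root@${host}:/ /backup/servers/${host}/
  done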

I started off making daily snapshots and had planned to trim to weekly after the first 30 days, and then to monthly at some later time, simply by removing the snapshots I wanted to trim. Disk space was being used more slowly than anticipated, so I never got around to writing the trim script.

The off-site backup requirement was fulfilled by designating one stack of drives in the enclosure as the swap stack. I would export that pool, pack the drives in a Pelican case, and leave the case with the office manager. The off-site records storage people would come in every Monday with a case of drives, swapping it for the current set. I would then put that set in and run the script that imported the drives and ran an rsync to move the most current snapshot data onto them. So the off-site sets were limited to weekly only. The off-site sets were 1/3rd the size of the on-site sets, so the expectation was that the on-site sets would have the most day-to-day granularity, while the off-site DR copy was less granular but provided a complete DR set.
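
In zpool terms, the weekly swap boiled down to something like this (pool, snapshot, and path names are made up, continuing the sketch above):

  # after the returned drives are inserted:
  zpool import offsite
  # copy the most recent snapshot's contents onto the off-site set
  rsync -a --delete /backup/servers/.zfs/snapshot/2015-03-09/ /mnt/offsite/servers/
  # export again so the drives can be pulled and shipped
  zpool export offsite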

Rsync and snapshots came together for that job in a way that, at the time, was surprising. They reduced the bandwidth needed for a network backup enough that direct off-site backup would have been possible. For a fraction of the cost of the off-site records people carrying the drives in and out, I could have located a backup server in a secure datacenter and added day-to-day granularity to the off-site copy. Or, for even less, I could have located the off-site storage at a remote site or the home of one of the company principals.

Making the first backup was the "expensive" part, network- and time-wise. Once the backup is seeded with one complete set of data, the daily deltas are quite small. If I had included log files, the daily deltas would have been much greater, as the snapshots would do nothing to collapse disk space consumed inside the log files. True deduplication would do that in theory, but deduplication is not a sound theory in my opinion.

On my next job, at a network of 10 hospitals and 65 clinics, each hospital had a data center full of VMware servers housing the clinical staff's virtual desktops; each clinician could walk to any computer terminal, log in, and see their still-running Windows desktop. We had terminals in nursing stations, offices, and scattered all over the place on carts. Each of those systems' data was stored on an EMC SAN, and in two of the larger hospitals we had an Avamar system running across a mirrored pair of SANs.

Integrating Avamar took so long that the SAN lifecycle was exhausted before Avamar was fully implemented. Avamar deduplication was supposed to collapse the dataset to minimal space, because there were huge levels of file-level redundancy when you are talking about a bunch of Windows boxes. But we hired Avamar themselves to do the work, they were later acquired by EMC, and with all of the resources of Avamar and EMC working on this, they failed to implement a working Avamar system in the time it took new SAN hardware to become obsolete and be replaced. I accomplished in a few weeks, with simple shell scripts, a reasonable machine-by-machine level of deduplication that outperformed what EMC was never able to implement even with a limited dataset. The hospital didn't run everything through Avamar; they just did one single SAN pair as a proof of concept before attempting a larger-scale Avamar system. It would have been a huge win if EMC had managed to get us to double the number of SANs we had, so there were many millions of dollars riding on the success of that proof of concept. Clearly something was proven in that effort, but it was limited to a proven failure. I moved on to another job before I heard of any large-scale successes in that system.

I did suffer through my enterprise VMware datasets being migrated and completely lost in the process; only the data I was keeping on local unmanaged storage survived EMC. There was no rollback capability: when my data dropped from the remote SAN, that event was carried over into the production SAN, and it was gone. It was human error, and the criticality of the data was minor, but between that and the apparent impossibility of deduplication across a large dataset, I am wary of people talking about deduplication. I think deduplication works on small datasets, but the combination of rsync and snapshots is very lightweight and works quite well for many larger datasets, as long as deltas can be isolated as files rather than data inside of a file. But that requires application design upstream of the backup process, or an rsync system that is application-aware. Enterprise databases can do this to some extent...


George
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
The API has the ability to import storage. Could that be helpful?
http://api.freenas.org/resources/storage.html#import
 
Bigtexun

Dabbler
Joined
Jul 17, 2014
Messages
33
That would probably keep the API aware of configuration changes. The system uses a configuration database, and direct use of the command line can bypass that. For some things, like the equivalent of changing temporary media, it may not be important, but a good FreeNAS add-on should try to use the API. The problem comes in if the program is designed to be used with other OS distributions, like xBSD, Solaris, or Linux, among others.
 

gdreade

Dabbler
Joined
Mar 11, 2015
Messages
34
Don't use Perl, please! In today's world, Python is a much better choice; Perl is dead.

While your concern is noted, I'm writing this tool not only for use on FreeNAS but on other systems as well. In particular, not all target systems have Python in the base system, and I'm trying to minimize dependencies.

Plus, most of it was already done before you posted that message and I don't feel like rewriting it :)
 

gdreade

Dabbler
Joined
Mar 11, 2015
Messages
34
The API has the ability to import storage. Could that be helpful?
http://api.freenas.org/resources/storage.html#import

Thanks for the link. Somehow I'd missed the API docs. That answers at least most of the original question.

That said, I think I'm probably *not* going to integrate with the REST API. I think that keeping the (web) user in ignorance of the transient datasets to which the tool is writing is likely to cause fewer problems.
 