To be clear: when I switched between FreeNAS and NAS4Free it was with totally blank hard disks ('dd if=/dev/zero of=/dev/da*'). No criss-crossing.
As it turns out, I figured out the problem and recovered the pool. The whole thing was unremarkable, and the cause was obvious (in hindsight).
Now, I have read loads of pool-recovery threads on here, and often the advice is to remove/add disks one by one until a failure disappears/occurs. Based on that advice I had already tried the same approach, but without success.
I noticed that the spontaneous reboot occurred when an attempt was made to mount /dev/da6 and so I removed that drive from the chassis and rebooted. NAS4Free booted up fine.
'zpool import' showed pool01 present but degraded. I successfully imported the pool using 'zpool import -f pool01' (I know using -f is a last resort, but the data on this pool was disposable so I had nothing to lose).
I then put /dev/da6 into a different chassis and booted up the excellent SystemRescueCD Linux distro. 'smartctl -A /dev/da6' showed:
Code:
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 1
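For what it's worth, a non-zero raw value on an attribute like that is easy to flag in a script before it takes a pool down with it. A minimal sketch; the sample line mirrors the smartctl output above, and the attribute choice and zero threshold are my assumptions (in practice you would pipe real 'smartctl -A /dev/da6' output in place of the sample):

```shell
#!/bin/sh
# Sketch: warn when a SMART attribute's raw value (last field) is non-zero.
# Sample line copied from the smartctl -A output above; replace the echo
# with: smartctl -A /dev/da6
sample='200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 1'

echo "$sample" | awk '
  # Field 2 is the attribute name; $NF is the raw value column.
  $2 == "Multi_Zone_Error_Rate" && $NF+0 > 0 {
    printf "WARNING: %s raw value = %s\n", $2, $NF
  }'
```

Running this against the line above prints a warning because the raw value is 1; on a healthy drive the raw value is 0 and the script stays silent.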
So the cause of the spontaneous reboots was a(nother) faulty brand-new Western Digital Red (which had initially tested fine). It's disappointing that such a failure could cause an operating-system meltdown; whether that's down to ZFS or NAS4Free I don't know.
There were files on the pool that I needed and so I recovered those. I could have stuffed a 4TB cold spare drive into the chassis and resilvered but by that point I had changed my plans regarding NAS4Free.
This experience has forced me to acknowledge, and act on, something that had been bothering me since I decided to switch from FreeNAS to NAS4Free: support in the event of catastrophe.
I hope that everyone picked up on what happened earlier in this thread. Just a few hours after I had posted news of pool01's demise, cyberjock appeared, ready to apply his expertise to recovering my data. That's the FreeNAS forums experience.
For the NAS4Free forums experience, you get pleas for help that (a) go unanswered, (b) fizzle out without a solution, or (c) draw "sorry dude ur datas gone"-type responses:
From the thread "is my ZFS pool unrecoverable":
"but i don't think anyone can recover ur pool."
"ZFS pool information is damaged on ur hdds. Try google "zfs fschk"."
And the one that sent a chill down my spine:
"i doubt there is any person on this forum that can help you."
I am not bashing NAS4Free and I know that the FreeNAS vs NAS4Free comparisons have been done to death. However, I will say that I prefer the NAS4Free interface. The pool-creation steps better match the ZFS literature's description of its component parts. The UI is quick and simple.
However, I just cannot ignore the vast difference in community expertise between FreeNAS and NAS4Free. I greatly appreciated the no-nonsense advice received in my build thread (and then felt guilty when I decided to use NAS4Free instead).
When my data is at risk from a ZFS problem then it's the likes of cyberjock and company that I need on my side (what happened to ProtoSD?). That support should be a major consideration for anyone who is evaluating the choice between these two products.
I recommend reading this epic recovery thread: "Please help! Started up FreeNAS, suddenly voluime storage (ZFS) status unknown?!".
"For the idiots that don't have proper backups in place"
So just how are the home users here economically backing up 96TB raw / 64TB usable of data? Or maybe they don't: invest in a mirror or tapes, or take the risk of having no backups.
"The chance of losing an entire NTFS filesystem on desktop hardware is just as great"
Which, on a different (pro-sumer) "server", I am finding out: yet again, just a few days ago, a 3TB NTFS disk suddenly became corrupted. Eight Memtest86+ passes revealed no bad memory (against my expectations), and now the finger of blame points to those StarTech SATA cards... (note: this is not my ZFS server!)
"We're big on the ECC and server hardware requirements because you should be running that on any server you're running, not just ZFS based ones."
Again, something I have come to appreciate just lately, after a long line of corrupted NTFS disks. Is ECC memory support really constrained to i3s and Xeons?
We live and learn...