Controller ate my ZFS pool

Status
Not open for further replies.

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
I have a FreeNAS box, which I use as a "Time Machine" target for the Mac in the house, as well as for video storage, general backups, etc. It boots from an 8GB USB stick and has four 750GB HDDs configured as a RAIDZ1 pool.

I have more HDDs available, but limited SATA ports. So I bought an 8 port card (dirt cheap from ebay) and decided to switch over to using it - with the intention of changing PC case and adding extra drives later.

Oh dear.

I shouldn't have done that.

I *should* have tried all new drives in the new controller, but I didn't, I used my existing pool drives.

I don't know what the controller has done to them, but now they aren't recognised as the pool any more :(

System:
Build FreeNAS-9.1.0-RELEASE-x64 (dff7d13)
Platform AMD Athlon(tm) II X4 640 Processor
Memory 7915MB


After a while sulking I went searching, and found http://serverfault.com/questions/297029/zfs-on-freebsd-recovery-from-data-corruption which looks similar enough to give me some optimism.

Code:
[root@FreeNAS] ~# zdb
Data:
    version: 5000
    name: 'Data'
    state: 0
    txg: 4956163
    pool_guid: 15585826249507765244
    hostid: 2429217988
    hostname: ''
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 15585826249507765244
        create_txg: 4
        children[0]:
            type: 'raidz'
            id: 0
            guid: 5708209934565116186
            nparity: 1
            metaslab_array: 34
            metaslab_shift: 34
            ashift: 12
            asize: 2244012146688
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 6161345816377449791
                path: '/dev/gptid/64f8ba3a-1029-11e3-8a56-14dae9eaddf1'
                phys_path: '/dev/gptid/64f8ba3a-1029-11e3-8a56-14dae9eaddf1'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 8929158312757003696
                path: '/dev/gptid/6560d59b-1029-11e3-8a56-14dae9eaddf1'
                phys_path: '/dev/gptid/6560d59b-1029-11e3-8a56-14dae9eaddf1'
                whole_disk: 1
                create_txg: 4
            children[2]:
                type: 'disk'
                id: 2
                guid: 2814370025582819070
                path: '/dev/gptid/65c95d49-1029-11e3-8a56-14dae9eaddf1'
                phys_path: '/dev/gptid/65c95d49-1029-11e3-8a56-14dae9eaddf1'
                whole_disk: 1
                create_txg: 4
    features_for_read:


Disturbingly this only lists 3 HDDs, despite all four having been OK previous to the reboot. I also wonder if this is reading some zfs cache file, rather than actual disks...
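
On the cache-file question: as far as I know, `zdb` with no arguments prints the configuration stored in the pool cache file rather than reading the disks, so the three children above describe what the system last recorded, not necessarily what is on the drives now. The `asize` value gives a rough cross-check. The sketch below assumes a raidz vdev's `asize` is the smallest member's size multiplied by the number of members (my understanding of the vdev code, not something confirmed in this thread); `per_disk_gb` is a made-up helper name.

```shell
#!/bin/sh
# Per-disk size implied by a raidz vdev's asize, in whole decimal GB.
# ASSUMPTION: asize = (smallest child size) x (number of children).
per_disk_gb() {
    asize=$1
    children=$2
    echo $(( asize / children / 1000000000 ))
}

per_disk_gb 2244012146688 3   # 3 children, as listed in the cached config
per_disk_gb 2244012146688 4   # 4 children, as remembered
```

The 3-child figure lands close to a 750GB disk minus a small swap partition, while the 4-child figure does not, which would be consistent with the raidz having been created from three disks. Treat that as a hint to check, not a conclusion.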

Code:
[root@FreeNAS] ~# zdb -lll /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3


That doesn't look good either...
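
One thing worth ruling out: the cached config above points at `/dev/gptid/...` entries, which are partitions rather than whole disks. FreeNAS normally puts swap on the first partition and the ZFS data on the second, so `zdb -l /dev/ada0` can fail to unpack labels even on a healthy member disk. A rough sketch for tallying readable labels per partition follows; the `count_good_labels` and `check_disks` helpers and the `adaNp2` partition names are assumptions for illustration, so confirm the real layout with `gpart show` first.

```shell
#!/bin/sh
# Count how many of the four ZFS labels in `zdb -l` output unpacked.
# Reads zdb output on stdin and prints a number from 0 to 4.
count_good_labels() {
    awk '
        /^LABEL [0-3]$/          { total++ }
        /failed to unpack label/ { failed++ }
        END                      { print total - failed }
    '
}

# Check the ZFS partition of each disk instead of the whole disk.
# ASSUMPTION: the data partition is pN=2 on every drive.
check_disks() {
    for dev in /dev/ada0p2 /dev/ada1p2 /dev/ada2p2 /dev/ada3p2; do
        [ -e "$dev" ] || { echo "$dev: missing"; continue; }
        echo "$dev: $(zdb -l "$dev" | count_good_labels) of 4 labels readable"
    done
}
```

Running `check_disks` only reads from the devices. A partition that reports 4 of 4 labels is a good sign for that disk; 0 of 4 on the partition itself is the worrying case.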


Is there any likelihood of recovering this mess?
Any pointers as to the right direction to take from here? Starting to get completely disillusioned with computers again :(
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
Some more info will be needed... What kind of controller did you get?
A SATA/SAS host bus adapter (usually OK) or a RAID card (not good)? Any labels on it?

And have you done any modifications except moving the disks? Maybe there is just a loose cable or power connection on the missing drive?
If there are no other modifications you could always move back to your old 4 port controller (mainboard?)
 

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
LSI raid controller (if memory serves it's an HP (somethingy)400)

I didn't configure anything on it and have pulled the disks back to the Mobo.
dmesg sees all four disks.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
LSI makes a lot of controllers, so more detail would help. However, if you can see the disks when they're directly connected to the motherboard, that's a good sign. Can FreeNAS see them? What does it say? If the pool doesn't come up on its own, can you import it?
 

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
FreeNAS doesn't see them, nor import a pool.
import from the command line doesn't work either
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
post the output of:

# camcontrol devlist
# zpool import
# zpool status

Do this in pastebin, not the forums!
 

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
Normally I'd respect the pastebin request - but there really isn't much text:
Code:
[root@tardis] ~# camcontrol devlist
<Hitachi HUA721075KLA330 GK8OAB0A>  at scbus0 target 0 lun 0 (ada0,pass0)
<Hitachi HUA721075KLA330 GK8OAB0A>  at scbus1 target 0 lun 0 (ada1,pass1)
<Hitachi HUA721075KLA330 GK8OAB0A>  at scbus2 target 0 lun 0 (ada2,pass2)
<Hitachi HUA721075KLA330 GK8OAB0A>  at scbus3 target 0 lun 0 (ada3,pass3)
<SanDisk Cruzer Fit 1.26>          at scbus6 target 0 lun 0 (pass4,da0)
[root@tardis] ~# zpool import
[root@tardis] ~# zpool status
no pools available


Just popped into the server loft - LSI Smart Array P400
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah, I'd say you have 3 possibilities at this point:

1. Your old controller used the disks in a hardware RAID so you have to use those disks with that controller forever (this is why we tell people to use HBAs and NOTHING else).
2. You've done something to wipe out critical data on the disks. Not sure how you might have 'accidentally' done this but listing it for completeness.
3. Your new controller sucks balls.

I'm very wary of anything that is HP, Dell, etc. We recommend the M1015 and that's all that I generally recommend. Others may have weird quirks that make using them in FreeNAS a bad idea. So my advice is if your data is important to get an M1015 and reflash it to IT mode.

I'm assuming you've plugged the disks back into the old controller and been disappointed with the result?
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
Just to confirm... You did not do any export or detach or anything. You just plugged the drives to the new controller?

Then you did not do anything (in the BIOS or GUI or anywhere), and upon not seeing your pool, you moved the drives back to the original SATA ports? I.e. you only looked in the GUI and only ran the commands you have shown us...

Are you booting from the same USB memory device? No changes with that?
 

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
Same USB stick, although it decided against booting at one point, so got moved to a different port on the rear (can't remember the exact point this happened).
No explicit export/detach at any point.

I powered down, unplugged the main power cable, moved the disks, re-plugged in and powered up.

No pools found - I would have tried an auto import.
Likely repeated the boot/import at least twice.

Then same process to power down and move the disks back, power up and auto-import
Powered down and went and sulked for a few hours - then tried the commands listed above.

I've tried not to do anything that looked like it would rewrite the disks - basically playing the "if I don't touch it, it shouldn't get worse" game.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
Try a new installation of FreeNAS on a known-good USB stick. If the old one was OK, then it should not randomly stop booting.
 

[XAP]Bob

Dabbler
Joined
Jan 6, 2013
Messages
21
Everything other than importing the pools works on the existing stick. The machine decided it couldn't find a boot drive until I moved it to a different port. I'm reasonably confident that there isn't anything seriously wrong with the install.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
There are 4, not mutually exclusive, possibilities:
  1. your pool is somehow gone, maybe thanks to the same gremlins that played with your boot sequence;
  2. really weird hardware failure manifesting in inability to properly read disks and keep the boot sequence;
  3. USB memory device corruption, that confused both zdb and zpool imports;
  4. FreeNAS bug.
Items 3. and 4. can be easily tested with a new FreeNAS on a new USB memory device. You do not need to change anything with your old configuration, just validate your pool (by importing, and possibly running a scrub).

Item 2. is beyond my troubleshooting abilities. The often-recommended route starts with taking the hard drives to another FreeNAS system that is known to be working.

As for item 1., I am still hopeful. Please execute zpool import -D
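
For reference, `-D` asks `zpool import` to list pools that have been marked destroyed. A minimal sketch of a few probes that should not write to the disks is below, ordered from least to most involved. The `probe` and `import_probes` helpers and the `RUN` switch are hypothetical names for illustration; the pool name `Data` is taken from the zdb output earlier in the thread. By default it only prints what it would run.

```shell
#!/bin/sh
# Print (or run, when RUN=1) a sequence of read-only import probes.
POOL="${POOL:-Data}"

probe() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

import_probes() {
    probe zpool import                 # pools found by the default device scan
    probe zpool import -d /dev/gptid   # search the gptid nodes named in the cached config
    probe zpool import -D              # pools marked as destroyed
    probe zpool import -F -n "$POOL"   # report whether rewind recovery could work, without doing it
}
```

Calling `import_probes` shows the four commands; `RUN=1 import_probes` executes them. None of these actually imports the pool, so any real import (ideally with `-o readonly=on`) stays a separate, deliberate step.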
 