ZFS pool recovery


Terotgut

Cadet
Joined
May 27, 2013
Messages
5
Hi! My FreeNAS is installed as a virtual machine on XenServer.
It has two 2 TB hard drives (in my case, virtual hard drives). After a power loss and reboot of the server, my share became unavailable.

When the system starts, it shows the message "GEOM: ada2: the secondary GPT table is corrupt or invalid. GEOM: ada2: using the primary only -- recovery suggested."
I tried to recover the GPT table with "gpart recover ada2", but it returns an input/output error.
When I run "zpool import", it returns the error described at http://illumos.org/msg/ZFS-8000-6X

"zdb -l" for the broken drive shows "failed to read label 0-3" (for the other drive it is OK).
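
For reference, ZFS keeps four copies of its label on each vdev (two at the front, two at the back), and on a FreeNAS disk they live on the freebsd-zfs partition rather than the raw disk. What I ran was roughly this (partition name from my setup):

Code:
zdb -l /dev/ada2p2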

Please help me recover my pool. If you need additional information, I can provide it.
 

Terotgut

Cadet
Joined
May 27, 2013
Messages
5
I think the main problem is in reading the ZFS labels. Maybe someone knows how I can recover them?
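
One thing I have not tried yet is a forced read-only import, in case the pool can be brought up without writing anything. A minimal sketch, assuming the pool is named share:

Code:
zpool import -f -o readonly=on share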
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Read this thread... this is a test of your attention to detail...

GPT table is corrupt or invalid error.

Note that the thread above fixes your GPT issue only. If your zpool won't mount, you have other problems that will also need to be addressed after you fix your GPT table. Considering that your GPT is corrupted AND the zpool is unmountable, my guess is that recovery will be difficult, if possible at all. You provided no information on why it's not mounting, so I am making some assumptions. I will tell you that people have had unrecoverable zpools because of an improper shutdown of the server. That's why servers should ALWAYS ALWAYS ALWAYS have a UPS to provide power for a clean shutdown.
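
For the impatient, the fix from that thread boils down to something like the following (substitute your own disk; the sysctl is only needed if GEOM refuses to write to an in-use disk, and should be set back to 0 afterwards):

Code:
sysctl kern.geom.debugflags=0x10   # allow writes to an active disk (use with care)
gpart recover ada2                 # rebuild the backup GPT from the primary copy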
 

Terotgut

Cadet
Joined
May 27, 2013
Messages
5
Thank you for the answer. The linked thread says: "I did a reboot and all is well. So I guess the correct command (at least for my situation) was gpart recover /dev/da7 as root. Hopefully someone will find this useful in the future. I didn't have to do anything except run this command. No sysctl parameters or anything."

But when I try "gpart recover /dev/ada2" as root, I just get "Input/output error".

The pool is not mounting because there are errors in the ZFS labels of one of the disks. When I try "zpool import" or "zpool import -f", I get the error "One or more devices are missing from the system."
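
If it helps, I can also check the kernel log for the underlying device error behind that I/O failure, something like:

Code:
dmesg | grep ada2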
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Can you post the output of the following commands inside [CODE][/CODE] tags?


gpart list

gpart show

zpool import

smartctl -a -q noserial /dev/ada2

camcontrol devlist

glabel status
 

Terotgut

Cadet
Joined
May 27, 2013
Messages
5
Code:
root@mfsbsd:/root # gpart list
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 75497471
first: 63
entries: 4
scheme: MBR
Providers:
1. Name: ada0s1
   Mediasize: 988291584 (942M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 32256
   Mode: r0w0e0
   attrib: active
   rawtype: 165
   length: 988291584
   offset: 32256
   type: freebsd
   index: 1
   end: 1930319
   start: 63
2. Name: ada0s2
   Mediasize: 988291584 (942M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 988356096
   Mode: r0w0e0
   rawtype: 165
   length: 988291584
   offset: 988356096
   type: freebsd
   index: 2
   end: 3860639
   start: 1930383
3. Name: ada0s3
   Mediasize: 1548288 (1.5M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 1976647680
   Mode: r0w0e0
   rawtype: 165
   length: 1548288
   offset: 1976647680
   type: freebsd
   index: 3
   end: 3863663
   start: 3860640
4. Name: ada0s4
   Mediasize: 21159936 (20M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 1978195968
   Mode: r0w0e0
   rawtype: 165
   length: 21159936
   offset: 1978195968
   type: freebsd
   index: 4
   end: 3904991
   start: 3863664
Consumers:
1. Name: ada0
   Mediasize: 38654705664 (36G)
   Sectorsize: 512
   Mode: r0w0e0

Geom name: ada1
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 3774873566
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada1p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r0w0e0
   rawuuid: 256c011a-6187-11e2-b8e4-e2ffd51c04f3
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: ada1p2
   Mediasize: 1930587717120 (1.8T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r0w0e0
   rawuuid: 25822b80-6187-11e2-b8e4-e2ffd51c04f3
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 1930587717120
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 3774873566
   start: 4194432
Consumers:
1. Name: ada1
   Mediasize: 1932735283200 (1.8T)
   Sectorsize: 512
   Mode: r0w0e0

Geom name: ada2
modified: false
state: CORRUPT
fwheads: 16
fwsectors: 63
last: 3774873566
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada2p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r0w0e0
   rawuuid: 25d887ce-6187-11e2-b8e4-e2ffd51c04f3
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: ada2p2
   Mediasize: 1930587717120 (1.8T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r0w0e0
   rawuuid: 25e0ae96-6187-11e2-b8e4-e2ffd51c04f3
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 1930587717120
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 3774873566
   start: 4194432
Consumers:
1. Name: ada2
   Mediasize: 1932735283200 (1.8T)
   Sectorsize: 512
   Mode: r0w0e0

Geom name: ada0s1
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 1930256
first: 0
entries: 8
scheme: BSD
Providers:
1. Name: ada0s1a
   Mediasize: 988283392 (942M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 40448
   Mode: r0w0e0
   rawtype: 0
   length: 988283392
   offset: 8192
   type: !0
   index: 1
   end: 1930256
   start: 16
Consumers:
1. Name: ada0s1
   Mediasize: 988291584 (942M)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 32256
   Mode: r0w0e0


Code:
root@mfsbsd:/root # gpart show
=>      63  75497409  ada0  MBR  (36G)
        63   1930257     1  freebsd  [active]  (942M)
   1930320        63        - free -  (31k)
   1930383   1930257     2  freebsd  (942M)
   3860640      3024     3  freebsd  (1.5M)
   3863664     41328     4  freebsd  (20M)
   3904992  71592480        - free -  (34G)

=>        34  3774873533  ada1  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3770679135     2  freebsd-zfs  (1.8T)

=>        34  3774873533  ada2  GPT  (1.8T) [CORRUPT]
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3770679135     2  freebsd-zfs  (1.8T)

=>      0  1930257  ada0s1  BSD  (942M)
        0       16          - free -  (8.0k)
       16  1930241       1  !0  (942M)


Code:
root@mfsbsd:/root # glabel status
                                      Name  Status  Components
                    ufsid/516c5efa6b8b4567     N/A  md0
                            iso9660/MFSBSD     N/A  cd0
                    ufsid/5165e882cc16a9c0     N/A  ada0s3
                             ufs/FreeNASs3     N/A  ada0s3
                    ufsid/5165e8828a5109ff     N/A  ada0s4
                             ufs/FreeNASs4     N/A  ada0s4
gptid/256c011a-6187-11e2-b8e4-e2ffd51c04f3     N/A  ada1p1
gptid/25822b80-6187-11e2-b8e4-e2ffd51c04f3     N/A  ada1p2
gptid/25d887ce-6187-11e2-b8e4-e2ffd51c04f3     N/A  ada2p1
gptid/25e0ae96-6187-11e2-b8e4-e2ffd51c04f3     N/A  ada2p2
                    ufsid/5165e77974bdc41f     N/A  ada0s1a
                            ufs/FreeNASs1a     N/A  ada0s1a


Code:
root@mfsbsd:/root # camcontrol devlist
<QEMU HARDDISK 0.10.2>             at scbus0 target 0 lun 0 (ada0,pass0)
<QEMU HARDDISK 0.10.2>             at scbus0 target 1 lun 0 (ada1,pass1)
<QEMU HARDDISK 0.10.2>             at scbus1 target 0 lun 0 (ada2,pass2)
<QEMU QEMU DVD-ROM 0.10>           at scbus1 target 1 lun 0 (cd0,pass3)


Code:
root@mfsbsd:/root # smartctl -a -q noserial /dev/ada2
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE-p2 i386] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     QEMU HARDDISK
Firmware Version: 0.10.2
User Capacity:    1,932,735,283,200 bytes [1.93 TB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA/ATAPI-7, ATA/ATAPI-5 published, ANSI NCITS 340-2000
Local Time is:    Thu Jun  6 03:17:57 2013 UTC
SMART support is: Unavailable - device lacks SMART capability.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Code:
root@mfsbsd:/root # zpool import
   pool: share
     id: 5432057381954189926
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
 config:

        share                                         UNAVAIL  missing device
          gptid/25822b80-6187-11e2-b8e4-e2ffd51c04f3  ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.



P.S. smartctl is not informative because FreeNAS is a virtual machine on XenServer.
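
As the output itself suggests, smartctl can be forced to continue with "-T permissive", though against a QEMU virtual disk it will probably still report nothing useful:

Code:
smartctl -a -q noserial -T permissive /dev/ada2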
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Here's where I tell you "OMFG".

Because you are running a virtual machine, you are instantly out of the scope of my ability to help. In fact, if you had mentioned up front that you were running FreeNAS in a virtual machine, about five people would have responded with "good F'in luck getting your data back". I probably wouldn't have responded at all. It's almost like the dummy's penalty for not following the recommendations for use of ZFS. It's not that people don't want to help you; it's that all options for recovery go away the second someone uses virtualization. Any troubleshooting needs to be done on the bare metal, then the recovery actions need to be adjusted to take into account your configuration and your virtual machines, then applied properly inside the virtual machine itself. That's something that can't reasonably be done over a forum or a remote session. It's something only the administrator will know, and if it wasn't well planned ahead of time there may not be a recovery option.

My guess: the real disk that holds the virtual disk ada2 is failing or has failed, or the virtualization layer is failing or has failed, or you have a software conflict in the hypervisor itself causing the problem. The possible causes are almost endless because of the significantly more complex virtualization layer. Because you virtualized FreeNAS and ZFS (which is a major no-no), ZFS is confused as all hell: it doesn't have enough information to realize that ada2 is broken.

At this point the best idea I can come up with is to figure out which physical disk "ada2" is actually stored on and run some SMART tests. I'd wager the disk has failed, but FreeNAS has no way of identifying and/or proving it because you virtualized it. If the physical disk holding "ada2" is bad, you can try a recovery using something like ddrescue in FreeBSD or Linux on a different machine to copy the contents of the bad disk to a spare good one. With some luck you might be able to get some of your data back. But good luck in any case, because I don't have any other ideas. There's a reason why there are threads titled things like Please do not run FreeNAS in production as a Virtual Machine! and then threads like "Absolutely must virtualize FreeNAS!" ... a guide to not completely losing your data. The last link says "I am not aware of specific issues that would prevent Xen from being suitable. There is some debate as to the suitability of KVM. You are in uncharted waters if you use these products." Both of those threads are the product of more than a dozen people losing everything because they thought they were so smart by virtualizing FreeNAS and ZFS. Unfortunately, I can't even trust that the I/O error message you are getting is truly an I/O error, because you are virtualizing. Basically, no error inside the virtual machine at this point is necessarily what it seems.
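
If you go the ddrescue route, a minimal invocation looks roughly like this (GNU ddrescue; the device names are examples, and the map file lets you resume and retry bad areas later):

Code:
# first pass: copy everything readable, skipping bad areas quickly
ddrescue -f -n /dev/ada2 /dev/ada3 rescue.map
# second pass: retry the remaining bad areas a few times
ddrescue -f -r3 /dev/ada2 /dev/ada3 rescue.map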

ZFS really, truly needs bare metal. Virtualizing removes a lot of the technological protections ZFS is supposed to provide for redundancy, and adds a lot of complexity that the server administrator is responsible for planning for, troubleshooting, and resolving on his own.

Unfortunately, you are truly on your own to fix it :( Good luck. I'd keep a record of what you do and what you find; if you do manage to get your data back, it might be helpful to post what you did. I've never used or seen anything except VMware products, and personally, I don't think I've seen anyone use virtualization and actually recover their data. :/ There is one forum member who may have some advice, but I believe his advice will be "recover from backup or give up". I also won't give his name because he's so incredibly tired of people not being smart with virtualization that he doesn't generally even help people anymore. It's really something he's had to deal with weekly (sometimes twice a week) for months. Yes, that many people end up losing their data because of virtualizing.

Edit: The one person with a lot of virtualizing experience, if he chooses to show himself, will probably post or PM you in the next 24-48 hours. If not, then you are on your own. You may want to run diagnostics on your hard drives (nothing destructive, of course) just in case he decides to help you. Those hard drive diagnostics may at least rule out a hard drive failure even if you can't get your data back.
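
Since SMART doesn't pass through to the guest, those diagnostics would have to run on the XenServer host (or another machine) against the physical disk; a sketch, with the device name being a guess:

Code:
smartctl -a /dev/sda        # full SMART report for the physical disk backing ada2
smartctl -t long /dev/sda   # start an extended self-test; check results later with -a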
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
One thing I'd like to add: fixing the GPT is not really the problem, and even after you fix it you will still have to make the zpool mountable. The GPT is more a symptom that something is seriously wrong than the solution to your problems. The error message "GEOM: ada2: the secondary GPT table is corrupt or invalid. GEOM: ada2: using the primary only -- recovery suggested." means that one copy of the GPT was bad while the other was good; the system is using the good copy and recommending you repair the backup (bad) copy.

My guess is that whatever happened to corrupt your GPT table is also the cause for your zpool woes.

Good luck!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Sorry I don't have a better answer for you. Virtualization is one of those things where, when it works, it's awesome. When it doesn't work, it tends to not work at all.
 