HELP ZFS Pool data recovery

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
The reason I'm recommending the additional drives is out of an abundance of caution, the fact that it isn't my data at risk, and that your most recent backup is "rather aged" by your own admission. I'm hoping that we can find a way to get you back to at least a point more recent than that.

Before performing any of the clone operations, record serial numbers of the disks and confirm them in the cloning software UI wherever possible.

On bare metal, if I wanted a simple UI, I'd use something like Clonezilla (possibly combined with a physical write blocker) or a live CD using dd. Since your host OS is Proxmox, which is Debian-based, you could also use dd from Proxmox, but again, be absolutely sure you are specifying the correct source and destination disks. If you dd an entire empty disk onto a good one, there's no walking that back.
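Something along these lines from the Proxmox shell, purely as a sketch (the sdX/sdY names are placeholders, not your actual disks; confirm serial numbers first, because dd will silently overwrite whatever destination you give it):

Code:
# List disks with model and serial to confirm which is which
lsblk -o NAME,SIZE,MODEL,SERIAL
# Clone the wiped source disk (placeholder /dev/sdX) onto the new blank disk (placeholder /dev/sdY)
dd if=/dev/sdX of=/dev/sdY bs=1M conv=noerror,sync status=progress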

Partition table editing is likely going to be a manual effort here, where we gpart list the table from one disk and then manually recreate it on the other; ditto the label edits. If zdb finds labels on Disk A, that's a good sign though.
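As a rough sketch of that manual effort on the FreeBSD side (the types, offsets, and sizes below are examples only and must come from the surviving disk, not from this post):

Code:
gpart list ada2                                    # read the intact table; note partition offsets and sizes
gpart create -s gpt ada0                           # lay down a fresh GPT on the blank disk
gpart add -t freebsd-swap -b 128 -s 4194304 ada0   # recreate partition 1 to match the survivor
gpart add -t freebsd-zfs ada0                      # recreate partition 2 in the remaining space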

For data recovery, a key principle is "don't ever write back to media you're attempting to recover from," so restoring the old zpool.cache file is best done to a separate location (a network drive), or by using a separate boot device (or Proxmox VM?) to make a fresh TrueNAS install and restoring the zpool.cache file to that. (Disconnect your data disks while you're doing it, though.)
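For illustration only (the saved-cache path and the pool name are assumptions here; substitute wherever your copy actually lives), a saved cache file can be referenced at import time rather than written back onto the original system:

Code:
# Point the import at a copy of the old cache file, read-only, without touching the original media
zpool import -c /mnt/netshare/zpool.cache.old -o readonly=on Tank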

Regarding the differences, you've already hit the point of using the -FX switch, so the next step is going manually spelunking for older transaction groups with zdb and then trying to import the pool as of that time using -T txg (rough sketch below).
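Roughly, that looks like the following (a sketch only; the device path, pool name, and txg value are examples to be replaced with what zdb actually reports):

Code:
zdb -lu /dev/ada2p2                            # dump labels plus uberblocks to see which txgs still exist on disk
zpool import -o readonly=on -f -T <txg> Tank   # attempt a read-only import rolled back to that txg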
Awesome, I will use dd as that seems to be the best tool for the job. I will triple-check everything! You will probably have to walk me through the partition table editing process a little more once I get to that point. I'll get another drive ordered right now. As for handling the zpool.cache file, I have a fresh TrueNAS install on a USB stick that I can plug into my main PC along with the drives and run on bare metal.
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
This will be the new plan.

New Drive arrived:
Drive A - wiped
Drive B - good <-- previously the "failed drive"; that information was outdated
Drive C - good
Drive D - New Drive
Drive E - New Drive

Method 1:

Drive A - wiped --cloned--> Drive D --> rebuild partition table/gptid labels on Drive A.
Drive B - resilver, if all goes well.
Drive C - good --cloned--> Drive E
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
If you come up with any other ideas or know of anyone who might have more ideas please feel free to post them here.
As a last resort before reverting to the backup from last year, you may give Klennet ZFS Recovery a try. The free version should identify which files are potentially recoverable from the damaged pool. However, an actual recovery operation would require coughing up the licence.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Awesome, I will use dd as that seems to be the best tool for the job. I will triple-check everything! You will probably have to walk me through the partition table editing process a little more once I get to that point. I'll get another drive ordered right now. As for handling the zpool.cache file, I have a fresh TrueNAS install on a USB stick that I can plug into my main PC along with the drives and run on bare metal.
Let's start with the zdb commands, looking at what you've got:

Code:
zdb -l /dev/ada0p2
zdb -l /dev/ada1p2
zdb -l /dev/ada2p2
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Let's start with the zdb commands, looking at what you've got:

Code:
zdb -l /dev/ada0p2
zdb -l /dev/ada1p2
zdb -l /dev/ada2p2
When doing "ada0p2" it says no such file or directory.

Code:
root@truenas[~]# zdb -l /dev/ada0
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
root@truenas[~]# zdb -l /dev/ada1
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
root@truenas[~]# zdb -l /dev/ada2
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
root@truenas[~]#
 
Last edited:

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
When doing "ada1p2" it says no such file or directory.

Try both p1 and p2 on each drive anyways.

If the partition structure is only missing on one disk, we can figure out how to rebuild it. It would suck to bail on this just because one drive was missing a label.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Try both p1 and p2 on each drive anyways.

Also if you've added any new drives, be aware that FreeBSD device enumeration may not number them in the order you expect, so try "ada3" and "ada4" as well.
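A quick way to map the adaX names to physical drives (standard FreeBSD/TrueNAS CORE tools; match against the serial numbers you recorded) is:

Code:
camcontrol devlist                           # shows which adaX device is which physical disk
geom disk list | grep -E 'Geom name|ident'   # "ident" is the drive's serial number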
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
EDIT: @jgreco, jinx! You owe me a soda. :tongue:

Send me your shipping address. You will shortly receive a large box containing a two liter bottle. Ignore the beeping UPS and the shaking of the paint mixing machine that the bottle is packed inside. I assure you it is completely safe to open the bottle and enjoy your soda. Grinch style.
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
ada0 - no luck. - This is the wiped drive.


Code:
root@truenas[~]# zdb -l /dev/ada1p1
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
root@truenas[~]# zdb -l /dev/ada1p2
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'Tank'
    state: 0
    txg: 2025670
    pool_guid: 2717787786726095806
    errata: 0
    hostid: 1361597103
    hostname: ''
    top_guid: 12486228298157547035
    guid: 3459701388371009720
    vdev_children: 2
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 12486228298157547035
        nparity: 1
        metaslab_array: 74
        metaslab_shift: 34
        ashift: 12
        asize: 11995904212992
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2470301540868142256
            path: '/dev/gptid/11b39573-ad95-11ed-8d1c-7df9cea98351'
            DTL: 48182
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3459701388371009720
            path: '/dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351'
            DTL: 48181
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 16273966696595496550
            path: '/dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351'
            DTL: 48180
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3
root@truenas[~]#


Now we're getting somewhere.

Code:
root@truenas[~]# zdb -l /dev/ada2p2
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'Tank'
    state: 0
    txg: 2083363
    pool_guid: 2717787786726095806
    errata: 0
    hostid: 1361597103
    hostname: ''
    top_guid: 12486228298157547035
    guid: 16273966696595496550
    vdev_children: 2
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 12486228298157547035
        nparity: 1
        metaslab_array: 74
        metaslab_shift: 34
        ashift: 12
        asize: 11995904212992
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2470301540868142256
            path: '/dev/gptid/11b39573-ad95-11ed-8d1c-7df9cea98351'
            DTL: 48182
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3459701388371009720
            path: '/dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351'
            not_present: 1
            DTL: 48181
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 16273966696595496550
            path: '/dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351'
            DTL: 48180
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3
root@truenas[~]#


Now that ^ should be all for the 4 TB drives. The rest below will be the log/cache devices:

Code:
root@truenas[~]# zdb -l /dev/ada3p1
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    state: 4
    guid: 8292917934443105375
    labels = 0 1 2 3
------------------------------------
L2ARC device header
------------------------------------
    magic: 6504978260106102853
    version: 1
    pool_guid: 2717787786726095806
    flags: 0
    start_lbps[0]: 20566878720
    start_lbps[1]: 20538562560
    log_blk_ent: 1022
    start: 4194816
    end: 34358951936
    evict: 20590295040
    lb_asize_refcount: 66048
    lb_count_refcount: 5
    trim_action_time: 0
    trim_state: 0

------------------------------------
L2ARC device log blocks
------------------------------------
log_blk_count:   1798 with valid cksum
                 0 with invalid cksum
log_blk_asize:   22749184

root@truenas[~]#


Code:
root@truenas[~]# zdb -l /dev/ada4p1
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'Tank'
    state: 0
    txg: 2083363
    pool_guid: 2717787786726095806
    errata: 0
    hostid: 1361597103
    hostname: ''
    top_guid: 13912610063312002762
    guid: 13912610063312002762
    is_log: 1
    vdev_children: 2
    vdev_tree:
        type: 'disk'
        id: 1
        guid: 13912610063312002762
        path: '/dev/gptid/111fa2ca-ad95-11ed-8d1c-7df9cea98351'
        metaslab_array: 73
        metaslab_shift: 29
        ashift: 12
        asize: 34354757632
        is_log: 1
        DTL: 48179
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3
root@truenas[~]#
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
ada0 - no luck. - This is the wiped drive.

Ok. So, let's wait for everyone to catch up here and offer opinions.

My feeling is that we need to see if we can re-establish a partitioning scheme on ada0. If you are ordering spare drives, we need to wait for those, and then make a copy of ada0 onto one of them. You then unplug ada0 and set it aside while we hack on the new disk's label.
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Ok. So, let's wait for everyone to catch up here and offer opinions.

My feeling is that we need to see if we can re-establish a partitioning scheme on ada0. If you are ordering spare drives, we need to wait for those, and then make a copy of ada0 onto one of them. You then unplug ada0 and set it aside while we hack on the new disk's label.
Sounds good. I can start the cloning process of ada0 to the blank drive that I currently have right now.

Once I get the OK I'll start the cloning process. (I'll be very careful :) )

Because I'll have the computer out, I'll also document all the S/Ns on the drives and post them here for safekeeping.
 
Joined
Oct 22, 2019
Messages
3,641
Not to go on a tangent, but for the sake of caution and sanity:

Before all of this started, why did you originally believe Drive B was failed/failing? (Did it spit out any errors? Were you alerted via the GUI or an email?)

Part of this whole process should minimize the risk of re-introducing a "potentially" failing drive.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Ok. So, let's wait for everyone to catch up here and offer opinions.
Catching up. Stand by for effortpost.

@Dawson the good news is that the reason I've been AFK here is that I've been simulating this in my lab, and I've been able to recover from a Linux-issued wipefs against a RAIDZ1 with one drive pulled. It's messy, but it came back intact.
 
Joined
Oct 22, 2019
Messages
3,641
Did you do this against one of the drives of a currently imported and active pool? (Or did you issue wipefs against said drive, while the pool was exported? I.e, the drives were available to the system, but no chance of ZFS-related I/O to any of the drives.)

EDIT: Typical forum confusion. This reply was directed at @HoneyBadger, not @Dawson :wink:
 
Last edited:

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Not to go on a tangent, but for the sake of caution and sanity:

Before all of this started, why did you originally believe Drive B was failed/failing? (Did it spit out any errors? Were you alerted via the GUI or an email?)

Part of this whole process should minimize the risk of re-introducing a "potentially" failing drive.

The drive showed no issues and suddenly disappeared. I found the reason: it was a loose PSU cable. The drive is totally fine as far as I (and SMART) know. When I wiped it, the pool was imported and active, but I did the wipe with the TrueNAS VM shut down. Then I booted the VM back up and found out I had wiped the wrong drive.
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Catching up. Stand by for effortpost.

@Dawson the good news is that the reason I've been AFK here is that I've been simulating this in my lab, and I've been able to recover from a Linux-issued wipefs against a RAIDZ1 with one drive pulled. It's messy, but it came back intact.

That is so good to hear. Thank you from the bottom of my heart. I won't do anything until I receive instructions from you, since I feel like you'd know best thanks to the lab testing.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Assumptions: All of your three drives are identical models. If we need to play with the partition sizes I'll need to do more.

In the examples below:

ada0 is "Drive A" that got the Proxmox wipe.
ada1 is "Drive B" that we thought failed, but it was just loose cabling and it's now recovered.
ada2 is "Drive C" that's the last disk standing.

You can see from ada1p2 that the last transaction group committed was 2025670, while ada2p2 is at 2083363. Assuming you basically hovered around the 5-second default txg timeout, that gap of 57,693 txgs (× 5 s ≈ 288,000 seconds) works out to roughly 80 hours.

Confirm with serial numbers that the order of these drives hasn't changed.
So, step zero is underway - clone Drive A to Clone A.

Assuming order has been maintained ada0 should have label 11b39573-ad95-11ed-8d1c-7df9cea98351

So, here's what we're going to do.

Finish your clone of Drive A to Clone A. Pull the original Drive A, set it aside, and replace it with Clone A. Get it presented back to the system in the exact same way. Instructions are in the spoiler.

Good. Buckle up.

Again, confirm with serial numbers that the order of these drives hasn't changed. We don't want to target the wrong drives.

Check the partition table on Drive C with
gpart backup ada2
If it looks good, and has output like below
Code:
GPT 128
1   freebsd-swap      128  4194304
2    freebsd-zfs  4194432 12582744

If it looks similar to the above (but with a way bigger number at the end) then:

Clone the partition table from Drive C to Clone A with
gpart backup ada2 | gpart restore ada0

Check the partition table on Clone A with
gpart backup ada0
It should be identical (same model drives, same partition layout)

See if you get an output from zdb -l ada0p2 now. If you do, then this is a good thing - check the txg number near the top. Hopefully it's closer to ada2p2's 2083363 than the older ada1p2 number.

Rewrite the missing GPTID of 11b39573-ad95-11ed-8d1c-7df9cea98351 to Clone A with
gpart modify -i2 -l 11b39573-ad95-11ed-8d1c-7df9cea98351 ada0
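As an optional sanity check before the reboot (my suggestion, not a required step), gpart can display the labels it now has on record:

Code:
gpart show -l ada0   # the label column for partition 2 should now read 11b39573-ad95-11ed-8d1c-7df9cea98351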

Reboot. Go back to the command line and check the results of zpool import which will hopefully give you the pool available for import:

Code:
root@freenas-lab[~]# zpool import
   pool: recoverme
     id: 9933807979428463458
  state: DEGRADED
status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
 config:


You'll probably need to do zpool import -F or -FX as well.
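For reference, those recovery imports take the general form below; importing read-only is an extra precaution on my part, and the -X extreme rewind should be the last resort since it can discard more recent transactions:

Code:
zpool import -f -F -o readonly=on Tank      # recovery-mode import, discarding the last few txgs if needed
zpool import -f -F -X -o readonly=on Tank   # extreme rewind; only if plain -F fails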
 
Last edited:

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Did you do this against one of the drives of a currently imported and active pool? (Or did you issue wipefs against said drive, while the pool was exported? I.e, the drives were available to the system, but no chance of ZFS-related I/O to any of the drives.)

EDIT: Typical forum confusion. This reply was directed at @HoneyBadger, not @Dawson :wink:

Clobbered them live with wipefs -af from a separate SCSI initiator. The TrueNAS CORE machine wouldn't export the pool; it required me to force a shutdown and reboot, and then go about the label rebuild described above.
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Assumptions: All of your three drives are identical models. If we need to play with the partition sizes I'll need to do more.

In the examples below:

ada0 is "Drive A" that got the Proxmox wipe.
ada1 is "Drive B" that we thought failed, but it was just loose cabling and it's now recovered.
ada2 is "Drive C" that's the last disk standing.

You can see from ada1p2 that the last transaction group committed was 2025670, while ada2p2 is at 2083363. Assuming you basically hovered around the 5-second default txg timeout, that gap of 57,693 txgs (× 5 s ≈ 288,000 seconds) works out to roughly 80 hours.

Confirm with serial numbers that the order of these drives hasn't changed.
So, step zero is underway - clone Drive A to Clone A.

Assuming order has been maintained ada0 should have label 11b39573-ad95-11ed-8d1c-7df9cea98351

So, here's what we're going to do.

Finish your clone of Drive A to Clone A. Pull the original Drive A, set it aside, and replace it with Clone A. Get it presented back to the system in the exact same way. Instructions are in the spoiler.

Good. Buckle up.

Again, confirm with serial numbers that the order of these drives hasn't changed. We don't want to target the wrong drives.

Check the partition table on Drive C with
gpart backup ada2
If it looks good, and has output like below
Code:
GPT 128
1   freebsd-swap      128  4194304
2    freebsd-zfs  4194432 12582744

If it looks similar to the above (but with a way bigger number at the end) then:

Clone the partition table from Drive C to Clone A with
gpart backup ada2 | gpart restore ada0

Check the partition table on Clone A with
gpart backup ada0
It should be identical (same model drives, same partition layout)

See if you get an output from zdb -l ada0p2 now. If you do, then this is a good thing - check the txg number near the top. Hopefully it's closer to ada2p2's 2083363 than the older ada1p2 number.

Rewrite the missing GPTID of 11b39573-ad95-11ed-8d1c-7df9cea98351 to Clone A with
gpart modify -i2 -l 11b39573-ad95-11ed-8d1c-7df9cea98351 ada0

Reboot. Go back to the command line and check the results of zpool import which will hopefully give you the pool available for import:

Code:
root@freenas-lab[~]# zpool import
   pool: recoverme
     id: 9933807979428463458
  state: DEGRADED
status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
 config:


You'll probably need to do zpool import -F or -FX as well.
Will do. They're all the same Toshiba MG03ACA400 4 TB drives. I will start cloning then!
 