HELP ZFS Pool data recovery

jgreco · Jun 18, 2023

winnielinnie said:
The way Proxmox "wiped" the drive

I should write a book, "101 reasons not to invent your own virtualization strategy." Sigh.

Dawson · Jun 18, 2023

winnielinnie said:
You accidentally "wiped" a good drive in the RAIDZ1 vdev of a pool that was already degraded.

I'm not sure why "zpool import" is suggesting that two drives are available, when reality you only have one available.

First...

RAIDZ1 (HEALTHY):
Drive A - good
Drive B - good
Drive C - good

Then...

RAIDZ1 (DEGRADED):
Drive A - failed
Drive B - good
Drive C - good

Then...

RAIDZ1 (DEAD):
Drive A - failed
Drive B - wiped <--- the mistake you made when wiping via Proxmox
Drive C - good

What's throwing everyone off is that "zpool import" without any flags suggests that two drives are healthy and available, which implies you can import the pool in a degraded state.

Yet this is not true.

Why is "zpool import"... lying?

My super amateur low-IQ shot in the dark: The way Proxmox "wiped" the drive perhaps just erased a portion of it at the start of the drive? Yet there still remains zpool metadata at the end of the drive, which zpool import detects?

Honestly, this is getting into low-level stuff, and I'm just shooting in the wind.

It was kinda a confusing thing to follow. So what happened was one of my drives disappeared so I thought it was dead. I bought a new one, took out the old, and slapped in the new. I when to format the new drive, and I accidentally formatted the wrong one. So at the time, there was only one drive that was good. and then I had the new drive and the drive I accidentally wiped. so I took out the new drive and put in the one I thought was dead, and it showed up. Now I have two good drives and one that is wiped. Hope this clears it up. So zpool import isn't lying

jgreco · Jun 18, 2023

How long did it take to "format" or "wipe" the drive?

winnielinnie · Jun 18, 2023

I honestly don't even know what "wipe" means in the context of Proxmox. (What does it use behind the GUI when you invoke this tool?)

@Dawson: How long did the "wipe" take?

EDIT: @jgreco, jinx! You owe me a soda.

Dawson · Jun 18, 2023

jgreco said:
I should write a book, "101 reasons not to invent your own virtualization strategy." Sigh.

I would totally run it on bare metal, but it is just for my home server, and I don't want to spend money on 2 servers. It's been working flawlessly until now. :( I need this server PC to run other vms like home assistant, pie hole, and some game servers that I host.

Dawson · Jun 18, 2023

jgreco said:
How long did it take to "format" or "wipe" the drive?

Instant.

Dawson · Jun 18, 2023

winnielinnie said:
I honestly don't even know what "wipe" means in the context of Proxmox. (What does it use behind the GUI when you invoke this tool?)

@Dawson: How long did the "wipe" take?

EDIT: @jgreco, jinx! You owe me a soda.

Little research turned this up: PVE uses "wipefs and "dd if=/dev/zero"

winnielinnie · Jun 18, 2023

Dawson said:
Instant.

That's probably why the "wiped" drive is still detected as "available" when you run "zpool import" without any flags.

Looks like it just destroyed the beginning of the drive. Probably a way to clear out any partition or filesystem info and/or writes random data to the first hundred MiB of the drive.

winnielinnie · Jun 18, 2023

Dawson said:
I would totally run it on bare metal, but it is just for my home server, and I don't want to spend money on 2 servers.

At this point, it might be too late. You may have in fact killed (accidentally) your pool by issuing Proxmox's "wipe" tool against the wrong drive before you did the replacement of the actual failed drive. (See my color-coding in an earlier post of the series of events. I'm only going by what you said in regards of "I wiped the wrong drive.")

EDIT: Need to run. But I'm out of ideas, anyways.

Dawson · Jun 18, 2023

I don't think you follow. I still have two good drives. the drive I thought failed, didn't actually fail. it was the result of a loose psu cable. Here's what happened:

RAIDZ1 (HEALTHY):
Drive A - good
Drive B - good
Drive C - good

Then...

RAIDZ1 (DEGRADED):
Drive A - good
Drive B - failed <---- *so i thought
Drive C - good

Then...

RAIDZ1 (DEAD):
Drive A - wiped <--- the mistake Imade when wiping via Proxmox
Drive D - new drive
Drive C - good

RAIDZ1 (DEGRADED): <---- current state
Drive A - wiped
Drive B - good <-- replaced new drive with old "failed drive" (that didn't actually fail)
Drive C - good

Hope that clears it up. We still have 2 good working drives with data on them. and I just don't know why we can't import them. Is there any 3rd party software that I could try to import the raidz array?

Dawson · Jun 18, 2023

I have also unplugged the drive that I 100% know I wiped from the system, zpool import shows the same thing

jgreco · Jun 18, 2023

Dawson said:
I would totally run it on bare metal, but it is just for my home server, and I don't want to spend money on 2 servers.

That's fine, but you need to do it correctly.

It's been working flawlessly until now.

That's the thing I hear all the time. It works flawlessly until it suddenly doesn't. That can be either a technical server issue of some sort (PCIe passthru flakes out) or an operational issue facilitated and encouraged by some design error (for you, right now, "I erased the wrong drive with Proxmox", to which the question is, "why the HELL did Proxmox have any access to the drives?")

I say it all the time, but the measure of success is NOT "I got it to do this thing that I wanted to do" but rather "I used a strategy that has considered how to mitigate numerous potential failure modes and has been successfully used by thousands of people". This is intended as constructive criticism because it is coming closer and closer to looking like you will not be recovering your data, so we can look forward to a better design for your next TrueNAS.

Even if one of the other ZFS wizards hanging around here manages to incant magic to fix you, the failure here is bad.

:( I need this server PC to run other vms like home assistant, pie hole, and some game servers that I host.

Yes, and maybe Proxmox is okay for that, but you could also run the VMs directly under SCALE as well. SCALE would work well for a handful of VMs. If you're going to run SCALE under Proxmox, you really need to follow the virtualization guidance I posted earlier in this thread. It's a pick-yer-poison sort of decision and I don't have a horse in that race. If Proxmox will be stable on your platform, then you're probably good to go that way if you wish. But then you need to use PCIe passthru for a HDD controller of some sort.

Dawson · Jun 18, 2023

jgreco said:
That's fine, but you need to do it correctly.

That's the thing I hear all the time. It works flawlessly until it suddenly doesn't. That can be either a technical server issue of some sort (PCIe passthru flakes out) or an operational issue facilitated and encouraged by some design error (for you, right now, "I erased the wrong drive with Proxmox", to which the question is, "why the HELL did Proxmox have any access to the drives?")

I say it all the time, but the measure of success is NOT "I got it to do this thing that I wanted to do" but rather "I used a strategy that has considered how to mitigate numerous potential failure modes and has been successfully used by thousands of people". This is intended as constructive criticism because it is coming closer and closer to looking like you will not be recovering your data, so we can look forward to a better design for your next TrueNAS.

Even if one of the other ZFS wizards hanging around here manages to incant magic to fix you, the failure here is bad.

Yes, and maybe Proxmox is okay for that, but you could also run the VMs directly under SCALE as well. SCALE would work well for a handful of VMs. If you're going to run SCALE under Proxmox, you really need to follow the virtualization guidance I posted earlier in this thread. It's a pick-yer-poison sort of decision and I don't have a horse in that race. If Proxmox will be stable on your platform, then you're probably good to go that way if you wish. But then you need to use PCIe passthru for a HDD controller of some sort.

Definitely, many lessons were learned over this whole ordeal. I will be taking your advice. I greatly appreciate your help and time through this. If you come up with any other ideas or know of anyone who might have more ideas please feel free to post them here. I will leave the system as is for the next couple of days in case of that. Otherwise, I will bite the bullet and start over.

joeschmuck · Jun 18, 2023

Something you have not tried is TrueNAS Core. While I do not think you will have better luck, there is a remote chance it will work, very remote. Maybe you can mount the pool. But before you accept there is no data recovery, give it a try. I'd also just do it on bare metal for the heck of it. You can create a bootable USB Flash Drive for this and disconnect all drives not part of the pool. I would also remove the "formatted" drive and install the new blank drive. The formatted drive is useless in my opinion for data recovery unless you desire to pay big money to have a company recover the data for you, if they can that is. Again, this is what I'd do but I try to think outside the box.

Some advice for the future: If you are going to do something risky to a drive, remove the good drives first. Keep the good drives safe. Trust me, we all have had our fair share of mistakes and loosing data. My first lesson was at age 16 (back in the 1970's). I had several other lessons unfortunately after that but I haven't had one in many years (knock on wood) and I hope to never run into another data loss due to me not being careful.

Dawson · Jun 18, 2023

joeschmuck said:
Something you have not tried is TrueNAS Core. While I do not think you will have better luck, there is a remote chance it will work, very remote. Maybe you can mount the pool. But before you accept there is no data recovery, give it a try. I'd also just do it on bare metal for the heck of it. You can create a bootable USB Flash Drive for this and disconnect all drives not part of the pool. I would also remove the "formatted" drive and install the new blank drive. The formatted drive is useless in my opinion for data recovery unless you desire to pay big money to have a company recover the data for you, if they can that is. Again, this is what I'd do but I try to think outside the box.

Some advice for the future: If you are going to do something risky to a drive, remove the good drives first. Keep the good drives safe. Trust me, we all have had our fair share of mistakes and loosing data. My first lesson was at age 16 (back in the 1970's). I had several other lessons unfortunately after that but I haven't had one in many years (knock on wood) and I hope to never run into another data loss due to me not being careful.

I appreciate the advice. I have already tired this

winnielinnie · Jun 18, 2023

Dawson said:
RAIDZ1 (DEAD):
Drive A - wiped <--- the mistake Imade when wiping via Proxmox
Drive D - new drive
Drive C - good

Something might have happened here, since you did this on an active pool. Even though you didn't touch the last remaining good drive (Drive C), ZFS may have been attempting to write or modify data and metadata when you nuked Drive A. (Unless I'm misinterpreting the steps you took.) Besides, anything disk-related should have been done in TrueNAS anyways, but that's another story and @jgreco explains it more in depth.

So now you have a "good" Drive C, a useless Drive D, and a previously good Drive A which you nuked while the pool was active.

Even though you re-inserted Drive B (which supposedly never had issues?) back into the mix, ZFS data and metadata could have been writing/modifying when you were nuking Drive A. This probably is why you cannot import the pool with Drive B and C, even though the drives are in working condition. (This doesn't even consider that Drive B may in fact still be wonky. You said it was a failing drive, but later realized it's not? Did something change? What alerted you to think it was failing in the first place?)

So then there are the "emergency" options that try to force the import and rollback to a working "checkpoint". Those didn't work.

Your options are exhausted now.

The only thing I can think of is to try in a forum or discussion group that understands low-level ZFS and might have some miracle solution.

Other than that, there are data recovery options (if it's even feasible).

Finally, there's the bitter pill to swallow: you've lost everything in the pool, and you can only resort to a year-old outdated backup.

EDIT: This probably will result in "no pools available to import", but I'm curious:

Code:

zpool import -D

Does it show up as a "destroyed" pool, or does it output that there are "no pools available to import"?

jgreco · Jun 18, 2023

Dawson said:
Drive A - wiped <--- the mistake Imade when wiping via Proxmox

The part that I cannot quite figure out is which ZFS component this is. My suspicion is that it is the "missing" device and it sounds from the description like Proxmox may have done a simple partition table blow-away. I get the uncomfortable feeling that it might be very possible to repartition the disk and make it available to ZFS, but having done partition table reconstructions in the past, it can be a bit finicky.

This is one of those times where it'd be really interesting to be able to experiment on this in a virtual environment that had a bunch of free disk space and the ability to snapshot and roll back, and would allow us to try things like a disklabel edit for the gptid, etc. I'm just not convinced that a crappy hypervisor's insta-nuke of a disk would actually make the disk unretrievable. The data could still be out there.

HoneyBadger · Jun 19, 2023

jgreco said:
Hey, @HoneyBadger ... do you have any good recovery suggestions here? This feels like there should be something obvious but I really almost never have to recover ZFS pools, so my Zfu is weak in this area.

Oofda. I think I'm caught up on where things stand right now, I'll see if I can add anything helpful.

@Dawson the issue hopefully is described correctly below:

RAIDZ1 (HEALTHY):
Drive A - good
Drive B - good
Drive C - good

(Obviously this is fine.)

RAIDZ1 (DEGRADED):
Drive A - good
Drive B - failed <---- *so i thought
Drive C - good

(At this point ZFS is writing to both Drive A and Drive C, and transaction counts are increasing. Drive B is offline and is NOT increasing its transaction count.)

RAIDZ1 (DEAD):
Drive A - wiped <--- the mistake Imade when wiping via Proxmox
Drive D - new drive
Drive C - good

(The pool is now unavailable because of lack of replicas.)

RAIDZ1 (DEGRADED): <---- current state
Drive A - wiped
Drive B - good <-- replaced new drive with old "failed drive" (that didn't actually fail)
Drive C - good

(At this point Drive A is unusable, Drive B is "living in the past" several dozen/hundred transaction groups behind, and Drive C is in the present.)

Drive B and C are both "good" in that they are physically functional and have pieces of a working ZFS RAIDZ1 pool, they just disagree (potentially by a large margin) what time it is, so we're now into applied temporal mechanics.

The way I see it we have two methods to attempt recovery. Both of these require an additional brand new 4T drive for optimal safety - you'll use this drive and the previous one you bought to replace the not-failed Drive B as blanks to clone.

1. Preferred Method: Clone A + C to new blank drives. Try to rebuild and restore the quick-formatted partition table/gptid labels on Cloned Drive A, attempt to import with CloneA+CloneC "in the present" and then resilver with Drive B. Hopefully no/minimal data lost.

2. Not-preferred Method: Clone B + C to new blank drives. Attempt to import CloneB+CloneC "in the past" to the point where Drive B fake-failed. Data added since then is discarded.

Let's see what we can make of this. It looks like the zpool.cache file was already deleted (unless you can pull a copy from a backup somewhere?) so let's try getting an SSH session, running zdb -l /dev/adaX for each of ada0/1/2 and put the output into [code][/code] tags. Identify which disk is which from "Drive A" and "adaX" if you can.

If we've got valid labels saved on Drive A (IIRC, ZFS saves four copies) then we can try the first recovery method. If Proxmox somehow torched all four, then we're limited to the second.

We'll do our best here to help you out.

Dawson · Jun 19, 2023

HoneyBadger said:
Oofda. I think I'm caught up on where things stand right now, I'll see if I can add anything helpful.

@Dawson the issue hopefully is described correctly below:

RAIDZ1 (HEALTHY):
Drive A - good
Drive B - good
Drive C - good

(Obviously this is fine.)

RAIDZ1 (DEGRADED):
Drive A - good
Drive B - failed <---- *so i thought
Drive C - good

(At this point ZFS is writing to both Drive A and Drive C, and transaction counts are increasing. Drive B is offline and is NOT increasing its transaction count.)

RAIDZ1 (DEAD):
Drive A - wiped <--- the mistake Imade when wiping via Proxmox
Drive D - new drive
Drive C - good

(The pool is now unavailable because of lack of replicas.)

RAIDZ1 (DEGRADED): <---- current state
Drive A - wiped
Drive B - good <-- replaced new drive with old "failed drive" (that didn't actually fail)
Drive C - good

(At this point Drive A is unusable, Drive B is "living in the past" several dozen/hundred transaction groups behind, and Drive C is in the present.)

Drive B and C are both "good" in that they are physically functional and have pieces of a working ZFS RAIDZ1 pool, they just disagree (potentially by a large margin) what time it is, so we're now into applied temporal mechanics.

The way I see it we have two methods to attempt recovery. Both of these require an additional brand new 4T drive for optimal safety - you'll use this drive and the previous one you bought to replace the not-failed Drive B as blanks to clone.

1. Preferred Method: Clone A + C to new blank drives. Try to rebuild and restore the quick-formatted partition table/gptid labels on Cloned Drive A, attempt to import with CloneA+CloneC "in the present" and then resilver with Drive B. Hopefully no/minimal data lost.

2. Not-preferred Method: Clone B + C to new blank drives. Attempt to import CloneB+CloneC "in the past" to the point where Drive B fake-failed. Data added since then is discarded.

Let's see what we can make of this. It looks like the zpool.cache file was already deleted (unless you can pull a copy from a backup somewhere?) so let's try getting an SSH session, running zdb -l /dev/adaX for each of ada0/1/2 and put the output into [code][/code] tags. Identify which disk is which from "Drive A" and "adaX" if you can.

If we've got valid labels saved on Drive A (IIRC, ZFS saves four copies) then we can try the first recovery method. If Proxmox somehow torched all four, then we're limited to the second.

We'll do our best here to help you out.

Yes, you followed that perfectly. Just to confirm I will need to buy another new 4tb? if so, I'll get that ordered today. Also, any cloning and partition rebuilding software you'd recommend? I've never used any sort of drive recovery software. I do have a backup I could restore to, that has the old zpool.cache file. This sounds like a solid plan, you are the man! Just one thing, I'm not sure how "non-preferred method #2" is any different than what we've been trying (other than the fact that you'd have me clone the drives to different drives). Or am I not following correctly? THANK YOU for your help!

HoneyBadger · Jun 19, 2023

Dawson said:
Yes, you followed that perfectly. Just to confirm I will need to buy another new 4tb? if so, I'll get that ordered today. Also, any cloning and partition rebuilding software you'd recommend? I've never used any sort of drive recovery software. I do have a backup I could restore to, that has the old zpool.cache file. This sounds like a solid plan, you are the man! Just one thing, I'm not sure how "non-preferred method #2" is any different than what we've been trying (other than the fact that you'd have me clone the drives to different drives). Or am I not following correctly? THANK YOU for your help!

The reason I'm recommending the additional drives is out of an abundance of caution, the fact that it isn't my data at risk, and that your most recent backup is "rather aged" by your own admission. I'm hoping that we can find a way to get you back to at least a point more recent than that.

Before performing any of the clone operations, record serial numbers of the disks and confirm them in the cloning software UI wherever possible.

On bare-metal, if I wanted a simple UI, I'd use something like Clonezilla (possibly combined with a physical write blocker) or a liveCD using dd - since you have Proxmox as a host OS which is Debian-based, you could use dd from Proxmox, but again, be absolutely sure you are specifying the correct source and destination disks. If you DD an entire empty disk onto a good one, there's no walking that back.

Partition table editing is likely going to be a manual effort here, where we gpart list the table from one disk and then manually create it on the other, ditto the label edits. If zdb finds labels on Disk A that's a good sign though.

For data recovery a key principle is "don't ever write back to media you're attempting to recover from" so restoring the old zpool.cache file would be best done if you can restore it to a separate location (network drive) or even if you can use a separate boot device (or Proxmox VM?) to make a fresh TrueNAS install and restore the zpool.cache file to it. (Disconnect your data disks when you're doing it though.)

Regarding the differences, you've already hit the point of using the -FX switch so we're at the point of going manually spelunking for older transaction groups with zdb and then trying to import the pool at that time using -T txg.

Important Announcement for the TrueNAS Community.

HELP ZFS Pool data recovery

Resident Grinch

Explorer

Resident Grinch

MVP

Explorer

Explorer

Explorer

MVP

MVP

Explorer

Explorer

Resident Grinch

Explorer

Old Man

Explorer

MVP

Resident Grinch

actually does care

Explorer

actually does care

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "HELP ZFS Pool data recovery"

Similar threads