HELP ZFS Pool data recovery

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@Dawson - Please do not try the command listed below; I'm asking whether it would be the correct command, as I don't know for certain.
I'm not even certain the format for the drive ident is correct. Make sure you wait for someone to confirm the next step if you want the best chance of recovering your data. Also, do you have a spare clean drive to install? Replacing onto a spare is just a nice clean way to do it if you have one.

@winnielinnie Would the OP use the CLI command
Code:
zpool offline Tank gptid/11b39573-ad95-11ed-8d1c-7df9cea98351
to offline the drive since it appears they can't do it via the GUI? I've never done this specific command before.
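For anyone following along: before offlining anything, the usual way to confirm the exact gptid is to read it out of the pool status first. A rough sketch (the pool name and gptid below come from earlier in this thread; FreeBSD/TrueNAS CORE tooling assumed):

```shell
# Confirm the device ident before offlining anything.
zpool status Tank              # shows each vdev member by its gptid label
glabel status | grep gptid     # maps gptid labels to physical da#/ada# devices
# Only then, offline the failed member by the label zpool status reported:
zpool offline Tank gptid/11b39573-ad95-11ed-8d1c-7df9cea98351
```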
 
Joined
Oct 22, 2019
Messages
3,641
Would the OP use the CLI command
A bit tired here, and having trouble following this thread, but as far as ZFS goes, that would be the ticket.

But...

...upon reading through this thread again, I realized the issue isn't offlining/replacing a disk. It's that the pool refuses to import in a "degraded" state, even with the "-F" flag, as suggested by @jgreco.

Any relevant "zpool" commands (other than "import") are moot at this stage.

Constant I/O errors when trying to forcefully import the pool? Bad cable connections? HBA? Some weird shenanigans with Proxmox / virtualization?
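To rule out the hardware angles above, a quick triage sketch (FreeBSD/TrueNAS CORE commands; "/dev/da0" is a placeholder, repeat for each pool member):

```shell
# Rough hardware triage before blaming ZFS itself:
dmesg | tail -n 50        # recent CAM/ATA errors, resets, or timeouts
camcontrol devlist        # does the HBA even enumerate all the drives?
smartctl -a /dev/da0      # SMART health for one member disk
```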
 
Joined
Oct 22, 2019
Messages
3,641
What is the output of this:
Code:
zpool import -d /dev/gptid/
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Code:
root@truenas[~]# zpool import -d /dev/gptid/
   pool: Tank
     id: 2717787786726095806
  state: FAULTED
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

        Tank                                            FAULTED  corrupted data
          raidz1-0                                      DEGRADED
            gptid/11b39573-ad95-11ed-8d1c-7df9cea98351  UNAVAIL  cannot open
            gptid/11bac542-ad95-11ed-8d1c-7df9cea98351  ONLINE
            gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351  ONLINE
        logs
          gptid/111fa2ca-ad95-11ed-8d1c-7df9cea98351    ONLINE
root@truenas[~]#
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Just plugged the drives into a totally different computer running TrueNAS on bare metal, and I get all the exact same errors. I don't think it's a cable issue or a VM issue. :(
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
@Dawson - Please do not try the command listed below; I'm asking whether it would be the correct command, as I don't know for certain.
I'm not even certain the format for the drive ident is correct. Make sure you wait for someone to confirm the next step if you want the best chance of recovering your data. Also, do you have a spare clean drive to install? Replacing onto a spare is just a nice clean way to do it if you have one.

@winnielinnie Would the OP use the CLI command
Code:
zpool offline Tank gptid/11b39573-ad95-11ed-8d1c-7df9cea98351
to offline the drive since it appears they can't do it via the GUI? I've never done this specific command before.
I've actually already tried that command; it fails. The output says: cannot open 'Tank': no such pool
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I've actually already tried that command; it fails. The output says: cannot open 'Tank': no such pool

Yeah that didn't have any chance of success due to the pool reference; that'd only work if ZFS was happy with the pool but missing a device.

I think we're into the realm of more arcane sorcery. I'm thinking that

rm -f /data/zfs/zpool.cache
zpool import -f -F Tank

is the next thing to try.
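One cautious addition, assuming nothing beyond standard OpenZFS behavior: "-n" combined with "-F" makes the recovery a dry run, reporting whether the rewind would succeed without writing anything:

```shell
# Dry-run recovery import: reports what "-F" would do without modifying the pool.
zpool import -f -F -n Tank
# If that looks sane, repeat without "-n" to actually attempt the rewind:
zpool import -f -F Tank
```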
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Yeah that didn't have any chance of success due to the pool reference; that'd only work if ZFS was happy with the pool but missing a device.

I think we're into the realm of more arcane sorcery. I'm thinking that

rm -f /data/zfs/zpool.cache
zpool import -f -F Tank

is the next thing to try.
(screenshot attachment: 1687114893326.png)
 
Joined
Oct 22, 2019
Messages
3,641
Yeah that didn't have any chance of success due to the pool reference; that'd only work if ZFS was happy with the pool but missing a device.

I think we're into the realm of more arcane sorcery. I'm thinking that

rm -f /data/zfs/zpool.cache
zpool import -f -F Tank

is the next thing to try.
I was going to suggest adding "-m" as well, or even specifying only the two remaining drives (with "-d" twice, once for each drive) while skipping the SLOG.

Using a combination of "-f", "-F", "-m", and "-X".

This (hopefully) takes the SLOG out of the picture. Even if it results in losing recent data, that's better than losing everything. :confused:
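As a sketch of the "-m" idea on its own (standard OpenZFS flags; the worst case is losing the last few seconds of synchronous writes that were held in the SLOG):

```shell
# Import while tolerating a missing or unusable log device:
zpool import -f -m Tank
```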
 
Joined
Oct 22, 2019
Messages
3,641
Code:
root@truenas[~]# zpool import -d /dev/gptid/
   pool: Tank
     id: 2717787786726095806
  state: FAULTED
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

        Tank                                            FAULTED  corrupted data
          raidz1-0                                      DEGRADED
            gptid/11b39573-ad95-11ed-8d1c-7df9cea98351  UNAVAIL  cannot open
            gptid/11bac542-ad95-11ed-8d1c-7df9cea98351  ONLINE
            gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351  ONLINE
        logs
          gptid/111fa2ca-ad95-11ed-8d1c-7df9cea98351    ONLINE

Based on that output, a hail mary attempt:
Code:
zpool import -nfFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank


You can repeat the above command without "-n" to commit:
Code:
zpool import -fFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank


It's "dangerous", but at this point, not sure if you care. :wink:
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80

Based on that output, a hail mary attempt:
Code:
zpool import -nfFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank


You can repeat the above command without "-n" to commit:
Code:
zpool import -fFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank


It's "dangerous", but at this point, not sure if you care. :wink:
I don't care at this point. :grin: Doesn't look like it worked :(

Code:
root@truenas[~]# zpool import -nfFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank
root@truenas[~]# zpool import -fFXm -d /dev/gptid/11bac542-ad95-11ed-8d1c-7df9cea98351 -d /dev/gptid/11c0215d-ad95-11ed-8d1c-7df9cea98351 Tank
cannot import 'Tank': one or more devices is currently unavailable
root@truenas[~]#
 
Joined
Oct 22, 2019
Messages
3,641
Out of ideas.

One last attempt is to have it look under a different directory for eligible partitions/drives that contain the pool metadata:
Code:
zpool import -fFXm -d /dev 2717787786726095806


Using the pool ID instead of its name.
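As a safe preliminary, it's worth noting that the same scan run without naming a pool only lists what ZFS can see, so it's a read-only way to check which directory ("-d /dev" vs "-d /dev/gptid/") exposes the metadata before committing to the forceful import:

```shell
# List importable pools found by scanning /dev; makes no changes to anything.
zpool import -d /dev
```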
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
How long does the following command typically take? I just started it after my last message.

Code:
root@truenas[~]# zpool import -f -F -m -X Tank
 
Joined
Oct 22, 2019
Messages
3,641
Just leave it. Cross your fingers, wait ten or so minutes.

At this point, I'm with the Grinch: we've entered the realm of whimsical magic.

It might be rolling back to a ("working") checkpoint, and discarding the ZIL.

EDIT: So about those backups...

EDIT 2: I remember he wrote "magic", so that's what I wrote. But when I scroll up, I discover that he wrote "arcane sorcery" the whole time. :oops: How'd I go from "arcane sorcery" to "whimsical magic" in my own mind? Maybe it's because my heart is filled with innocence and goodness?
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
Just leave it. Cross your fingers, wait ten or so minutes.

At this point, I'm with the Grinch: we've entered the realm of whimsical magic.

It might be rolling back to a ("working") checkpoint, and discarding the ZIL.

EDIT: So about those backups...
Okay, I'll give it some time; how does up to an hour sound? I only had maybe 1-1.5 TB of data on those drives. I've heard of people running this command for 5 days with no output.
 
Joined
Oct 22, 2019
Messages
3,641
I wouldn't wait more than an hour, honestly. Not even ten minutes.

I had suggested this, but you already tried the other command first:
One last attempt is to have it look under a different directory for eligible partitions/drives that contain the pool metadata:
Code:
zpool import -fFXm -d /dev 2717787786726095806

So about those backups...

:frown:
 

Dawson

Explorer
Joined
Jun 17, 2023
Messages
80
I wouldn't wait more than an hour, honestly. Not even ten minutes.

I had suggested this, but you already tried the other command first:


So about those backups...

:frown:
I stopped the last command, and it totally locked up the system, so I had to do a hard restart. I ran:
Code:
zpool import -fFXm -d /dev 2717787786726095806

And that also locked it up too.
 
Joined
Oct 22, 2019
Messages
3,641
You accidentally "wiped" a good drive in the RAIDZ1 vdev of a pool that was already degraded.

I'm not sure why "zpool import" is suggesting that two drives are available, when in reality you only have one available.

First...

RAIDZ1 (HEALTHY):
Drive A - good
Drive B - good
Drive C - good


Then...

RAIDZ1 (DEGRADED):
Drive A - failed
Drive B - good
Drive C - good


Then...

RAIDZ1 (DEAD):
Drive A - failed
Drive B - wiped  <--- the mistake you made when wiping via Proxmox
Drive C - good

What's throwing everyone off is that "zpool import" without any flags suggests that two drives are healthy and available, which implies you can import the pool in a degraded state.

Yet this is not true. Why is "zpool import"... lying?

My super amateur low-IQ shot in the dark: perhaps the way Proxmox "wiped" the drive only erased a portion at the start of the disk? Yet zpool metadata still remains at the end of the drive, which "zpool import" detects?

Honestly, this is getting into low-level stuff, and I'm just guessing.
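One way to test that guess, assuming the drive is visible to the OS ("/dev/da1" below is a placeholder for the wiped disk): ZFS writes four copies of its label, two at the front of the device and two at the end, and "zdb -l" dumps whichever copies survive.

```shell
# Dump the ZFS labels on the suspect disk. If only the trailing pair of
# labels prints, the front of the disk was wiped but the tail still
# advertises pool membership, which would match the theory above.
zdb -l /dev/da1
```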
 
Top