panic: Solaris(panic): blkptr at 0x.. has invalid CHECKSUM

Joined
Jun 15, 2015
Messages
4
Issue:
I've been running FreeNAS for a few years now, since v9.3. I've been applying updates via the GUI all this time, but the last time I did this the system panicked (see attached) while trying to mount the ZFS pool. At the time, I was on v11.0-U4. The previous boot environments available are v11.0-U2 and the initial install. (I performed a rebuild early last year after ditching Corral.)

I tried going back to a previous boot environment (there were 2 previous available) and had the same failure at ZFS pool import.

I then also tried booting v11.1-U2 and v11.1-U3, with the same result: a panic during ZFS import.

I ran a 24hr memory test and all passed.

I then also booted the system in single user mode and tried mounting the volume - still the same result.

Now I suspect a hard drive may be at fault. I don't have a deep technical understanding of the FreeNAS architecture, but I do know my way around the Unix operating system a little.

Can someone please point me in the right direction toward resolving this issue? Specifically, any troubleshooting commands or tricks I can try to verify various aspects of the build.

My last resort is a rebuild, but my latest backup is more than a month old, and I'd like to recover as much as possible before rebuilding.

Any help is greatly appreciated.

Thanks in advance.
------------------------

System Config:

Code:
CASE:   Fractal Design Define Mini
M/B:	Asus H97M-PLUS  BIOS Ver.2202
CPU:	Intel Pentium G3258 (Anniversary Edition)
RAM:	32GB DDR3/1333MHz
HDDs:	4 x 3TB Seagate NAS HDD (ST3000VN000)
PSU:	Corsair 450W 80-plus
BOOT:	2 x Strontium 16GB USBs (mirrored)
RAID:	RAID-Z2
NAS:	FreeNAS v11.0-U4

------------------------

Initial error during boot.

Output of 'zpool import' command (single user mode)

Followed by 'zpool import -f VOL1' (single user mode).
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Your images are completely broken. This is exactly why we ask you to upload images to the forum instead of using external hosts.
 
Joined
Jun 15, 2015
Messages
4
My apologies. Here they are:

1. Initial error during boot, post initiating update from the GUI.
upload_2018-3-21_13-20-11.png


2. Output of 'zpool import' cmd, after booting into single user mode in a prior boot environment:
upload_2018-3-21_13-22-47.png


3. Output of ensuing 'zpool import -f VOL1' cmd (single user mode)
upload_2018-3-21_13-25-4.png


Thank you again.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Looks like some pretty nasty metadata corruption. If you're lucky and this was recent, you might be able to roll back a few TXGs to get to the bulk of your data.

You can also try to import it read-only. I doubt it'll work, but can't hurt to try.
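For reference, a minimal sketch of what a read-only import attempt could look like. The pool name VOL1 is taken from this thread; the /mnt alternate root is an assumption based on where FreeNAS normally mounts pools. The script only builds and prints the command so it can be reviewed before being typed in by hand from single user mode.

```shell
#!/bin/sh
# Assumptions: pool name VOL1 (from this thread), altroot /mnt (usual FreeNAS mount point).
POOL="VOL1"

# Flags used:
#   -o readonly=on : import the pool without writing anything to it
#   -f             : force the import, since the pool was not cleanly exported
#   -R /mnt        : mount datasets under an alternate root
RO_IMPORT="zpool import -o readonly=on -f -R /mnt $POOL"

# Print the command for review; nothing is executed against the pool here.
echo "$RO_IMPORT"
```

If the import succeeds, copy the data off before trying anything that writes to the pool.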

This could very well have been caused by an error in RAM, but that's impossible to determine now.

A ZFS expert might have some other ideas that might allow for some data to be read, but don't count on it.
 
Joined
Apr 9, 2015
Messages
1,258
https://illumos.org/msg/ZFS-8000-EY

The pool has been written to from another host, and was not cleanly exported from the other system. Actively importing a pool on multiple systems will corrupt the pool and leave it in an unrecoverable state.

So it sounds like you will have to force the import with -f, which is also listed at the very bottom of the error message page. My guess is that something went wrong during the shutdown, and rolling back makes the system think the pool is from a different system altogether. As long as you didn't change something on the pool that makes it incompatible with a different version, like updating feature flags, it should be fine with a forced import.

Actually, going to the error message page is a pretty good first step. I had never seen theirs before, but it explains everything.
 
Joined
Jun 15, 2015
Messages
4
@Ericloewe I did attempt a READ ONLY import, but that failed similarly.
@nightshade00013 The 3rd screenshot shows the output from an attempt at importing with the -f parameter. Same result.
Thanks for the quick responses!
 
Joined
Apr 9, 2015
Messages
1,258
Ahhh, it's probably too late for me to be trying to read then.

Anyway, https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222734 probably explains what has happened and drives home the reason for ECC RAM, since in that report a "bit flip" caused an issue.

You might try -Fn to see whether the pool can be recovered with minimal loss, and then try -F if the first one looks good.

https://www.freebsd.org/cgi/man.cgi?zpool(8)

-F Recovery mode for a non-importable pool. Attempt to return
the pool to an importable state by discarding the last few
transactions. Not all damaged pools can be recovered by using
this option. If successful, the data from the discarded
transactions is irretrievably lost. This option is ignored if
the pool is importable or already imported.

-n Used with the -F recovery option. Determines whether a non-
importable pool can be made importable again, but does not
actually perform the pool recovery. For more details about
pool recovery mode, see the -F option, above.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
You probably need lowercase f, too, so -f -F -n.
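Spelled out, the two steps might look like the sketch below. It assumes the pool name VOL1 from this thread and only prints the commands for review. Step 1 is the dry run (-n), which changes nothing. Step 2 actually discards the last few transactions, so it should only be run if step 1 reports that the pool can be made importable.

```shell
#!/bin/sh
# Assumption: pool name VOL1 (from this thread).
POOL="VOL1"

# Step 1: dry run. -f forces the import, -F requests recovery mode,
# -n only checks whether recovery would work without performing it.
DRY_RUN="zpool import -f -F -n $POOL"

# Step 2: real recovery. Data from the discarded transactions is lost for good.
RECOVER="zpool import -f -F $POOL"

# Print both commands; nothing is executed against the pool here.
echo "1) $DRY_RUN"
echo "2) $RECOVER"
```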
 
Joined
Jun 15, 2015
Messages
4
Is it possible to roll back TXGs when you're not able to successfully mount a pool? If so, what's the procedure?
 

mrpizat

Cadet
Joined
Jun 10, 2019
Messages
1
Were you ever able to resolve this issue? I'm having a similar error, only getting there was a bit different: I lost a drive, replaced the drive, and resilvered; there was bad RAM and the OS went bad. Now I'm trying to import in a new environment and getting that error.
 