Yesterday morning I was playing around with testing the AES-NI benchmark thread and I panicked the machine during a scrub (automated every 1st and 15th at 0330). The issue may have existed before now, but I only noticed it when rebooting after the panic.
On bootup (and in the dmesg output) I see:
Code:
GEOM: da7: the primary GPT table is corrupt or invalid.
GEOM: da7: using the secondary instead -- recovery strongly advised.
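Before attempting any repair, it may be worth snapshotting both GPT copies so the current state can be restored if something goes wrong. This is a hypothetical sketch, not from the original threads: it assumes 512-byte sectors and derives the sector counts from the `gpart list da7` output below (Mediasize 2000398934016 / 512 = 3907029168 total sectors; a GPT copy is 33 sectors, and the secondary sits in the last 33).

```shell
# Sketch (assumes 512-byte sectors, numbers taken from gpart list da7):
# PMBR (LBA 0) + primary GPT header and entry array (LBAs 1-33)
dd if=/dev/da7 of=/root/da7-gpt-primary.bin bs=512 count=34

# Secondary GPT copy: the last 33 LBAs of the disk
# (3907029168 total sectors - 33 = start at LBA 3907029135)
dd if=/dev/da7 of=/root/da7-gpt-secondary.bin bs=512 count=33 skip=3907029135
```

The backups are small (17 KiB each) and can be written back with `dd` in the other direction if a repair attempt makes things worse.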
I've spent the last 8 hours or so reading what others have done. I read every thread I could find, and nobody seems to have a solution that worked aside from zeroing out the disk (or at least the GPT tables) and re-adding it to the array, or using alternate utilities such as Parted Magic's 'gpt fdisk' command.
In the spirit of fixing this issue (and learning a little something) without resorting to another boot CD or wiping the drive, how do I fix this?
System specs:
FreeNAS-8.3.1-RELEASE-x64 (r13452)
E5606 with 20GB of RAM
ZFS v28 running 18x2TB on RAIDZ3
I've never had any problems with any of my disks, and reviewing the SMART data for the drive shows nothing to indicate anything is going wrong.
Here are some outputs that were commonly asked for in other threads about the same issue:
Code:
# gpart show da7
=>        34  3907029101  da7  GPT  (1.8T) [CORRUPT]
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834703    2  freebsd-zfs  (1.8T)

# gpart list da7
Geom name: da7
modified: false
state: CORRUPT
fwheads: 255
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: da7p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r1w1e1
   rawuuid: 762670b2-4a95-11e2-bca4-0015171496ae
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: da7p2
   Mediasize: 1998251367936 (1.8T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2
   rawuuid: 763790a1-4a95-11e2-bca4-0015171496ae
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 1998251367936
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 3907029134
   start: 4194432
Consumers:
1. Name: da7
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r2w2e5

# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 17h23m with 0 errors on Mon Apr 1 21:23:24 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz3-0                                      ONLINE       0     0     0
            gptid/6fbb91d5-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/70448fd2-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/70c0c7b3-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/713de0d5-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/71e3eea1-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/728458d2-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/7326aebc-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/73c64f27-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/7468c69a-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/75045f96-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/75a0096a-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/763790a1-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/76d701fa-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/77759c5c-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/78190bd3-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/78bb9173-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/795a7052-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0
            gptid/79fbc7b0-4a95-11e2-bca4-0015171496ae  ONLINE       0     0     0

errors: No known data errors
Several people mentioned using gpart recover /dev/da7 or gpart recovery, but everyone who tried it said it didn't work. Some places even mention setting sysctl kern.geom.debugflags=0x10 before running the other commands. In their defense, though, they seemed to have other issues that may have prevented the command from fixing everything anyway. This issue seems to have been more widespread with FreeNAS 0.7 and on USB sticks.
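For reference, the sequence those threads describe would look roughly like the following. This is a sketch reconstructed from the descriptions above, not something verified on this system; the debugflags step disables GEOM's write protection for devices that are in use, so it is worth reverting afterwards.

```shell
# Sketch of the recovery sequence mentioned in the older threads:
sysctl kern.geom.debugflags=0x10   # allow writes to an open/in-use disk
gpart recover da7                  # rebuild the primary GPT from the secondary copy
sysctl kern.geom.debugflags=0      # restore normal write protection
gpart show da7                     # check whether [CORRUPT] is gone
```

Whether the debugflags step is actually needed presumably depends on whether GEOM considers da7 open at the time; on a disk that is part of an active pool (swap on p1, ZFS on p2) it likely is.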
So before I try either of those commands, I'm curious whether they are still the recommended way to repair this, or whether I've misunderstood something. The threads I found were sometimes quite old (2011 or earlier), so I'd like someone to confirm the correct command to run for this error.
Some places even say this is a known quirk of FreeBSD and ZFS and should be ignored. But considering that one disk has this issue and the rest don't, I think it's something that should be fixed.
Any input from the FreeBSD wizards?