Panic When Resilvering after Replacing Disk

Status
Not open for further replies.

bollar

Patron
Joined
Oct 28, 2012
Messages
411
I am in the process of replacing the drives in one vdev of a 4 x RAIDZ2 pool. The first two drives went fine. On the third drive, roughly 20 minutes into the resilver, the system panics, reboots, and starts the resilver over from the beginning. I have tried a fresh destination drive as well as removing the original disk. I have not been able to get the resilver to complete, and the pool is currently degraded.

I've never seen a panic on FreeBSD before and I don't know where to start to diagnose the problem. Any advice?
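A reasonable first pass, assuming the stock FreeBSD/FreeNAS dump locations (an assumption about your install -- adjust the paths), is to look for a saved crash dump and for the panic string in the message buffer:

```shell
#!/bin/sh
# Look for kernel crash dumps saved by savecore after the reboot.
# /var/crash is the stock FreeBSD dump directory; /data/crash is the
# usual FreeNAS location (assumption -- adjust for your install).
found=0
for d in /var/crash /data/crash; do
    if [ -d "$d" ]; then
        echo "checking $d:"
        ls -l "$d"
        found=1
    fi
done
[ "$found" -eq 1 ] || echo "no dump directory found"
# The panic string often survives in the kernel message buffer after reboot
dmesg 2>/dev/null | grep -i panic || echo "no panic string in dmesg"
```

If a vmcore shows up in the dump directory, a backtrace from it is the kind of thing the devs will want attached to a bug report.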



System:

FreeNAS 11.2-RELEASE
Platform: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz, 16 Cores @ 2.2GHz x 2
Supermicro X9DRD
LSI SAS2308 PCI-Express Fusion-MPT SAS-2
Chelsio S320e Dual-Port 10GbE
CPU VM Support: Full
Memory: 64GB
Chassis: Chambro 40700 4U-48 bay / SAS-2 Backplane
HDD: 6TB x 16, 4TB x 5, 3TB x 3, 2TB x 10, 1.5TB x 12
SSD: 120GB x 2 (mirrored boot), 18.64GB x 1 (SLOG)
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
With the original drive removed and no replacement target inserted, a resilver completed. I decided to put one of the target drives back in to see what would happen this time. That reintroduced the panic.

Code:
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 0 in 0 days 13:01:44 with 0 errors on Wed Dec 26 05:52:37 2018
config:

    NAME                                            STATE     READ WRITE CKSUM
    bollar                                          DEGRADED     0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/ec7bb72e-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/eba940e2-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/ed468c77-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/4053ba3b-aed7-11e7-a9dd-0025909434fc  ONLINE       0     0     0
        gptid/976c6aaf-b2cc-11e7-b3a1-0025909434fc  ONLINE       0     0     0
        gptid/2f754939-b05c-11e7-b3a1-0025909434fc  ONLINE       0     0     0
        gptid/298ebf65-341e-11e8-a3f9-00074307578b  ONLINE       0     0     0
        gptid/f27c190c-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
      raidz2-1                                      ONLINE       0     0     0
        gptid/f42213d8-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/f57c6efa-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/f66364a1-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/f71b89a9-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/f849bcf2-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/f9a941fd-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/faf2b1e5-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
        gptid/fc168dba-0aba-11e7-8136-0025909434fc  ONLINE       0     0     0
      raidz2-3                                      ONLINE       0     0     0
        gptid/643db24f-15b8-11e8-98af-00074307578b  ONLINE       0     0     0
        gptid/6593ff26-15b8-11e8-98af-00074307578b  ONLINE       0     0     0
        gptid/6747ea3f-15b8-11e8-98af-00074307578b  ONLINE       0     0     0
        gptid/68f766f3-15b8-11e8-98af-00074307578b  ONLINE       0     0     0
        gptid/6b4e6f1e-15b8-11e8-98af-00074307578b  ONLINE       0     0     0
        gptid/f3547a81-1996-11e8-86d7-00074307578b  ONLINE       0     0     0
        gptid/26c0673d-6c0d-11e8-b265-00074307578b  ONLINE       0     0     0
        gptid/cc712a62-6c5c-11e8-b265-00074307578b  ONLINE       0     0     0
      raidz2-4                                      DEGRADED     0     0     0
        2437984594064647901                         UNAVAIL      0     0     0  was /dev/gptid/eaadcad5-0882-11e9-942e-00074307578b
        gptid/b4c921fe-0737-11e9-b757-00074307578b  ONLINE       0     0     0
        gptid/8a111399-3393-11e8-a3f9-00074307578b  ONLINE       0     0     0
        gptid/bc85928e-6e8c-11e8-b265-00074307578b  ONLINE       0     0     0
        gptid/8d98777d-3393-11e8-a3f9-00074307578b  ONLINE       0     0     0
        gptid/a112faa5-d146-11e8-800b-00074307578b  ONLINE       0     0     0
        gptid/b6878f33-06e7-11e9-b757-00074307578b  ONLINE       0     0     0
        gptid/930e7916-3393-11e8-a3f9-00074307578b  ONLINE       0     0     0
    logs
      gptid/fc62586e-0aba-11e7-8136-0025909434fc    ONLINE       0     0     0

errors: No known data errors

 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
Still sitting with a clean resilver but a degraded array, and considering my options. A restore would take a long time (weeks?), so that's a last resort. I wonder if there's something else I should look at (memory allocation? tunables?), or whether to try a live CD of a different OS with ZFS support, like OmniOS.

Or just leave the array "as-is" until another two drives fail in that VDEV and restore at that point.
 
Joined
Jan 18, 2017
Messages
525
Probably want to post a bug report about this so the devs can take a look at the problem.
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
Probably want to post a bug report about this so the devs can take a look at the problem.
Yeah. I'd like to be able to give them pertinent info/logs, but I'm not sure what those would be.
 
Joined
Jan 18, 2017
Messages
525

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
The panic isn't a good thing.
In order to resolve your Degraded state, you need to remove the "Unavailable" drive from the pool. As long as the drive remains unavailable, you will never be able to get rid of the Degraded state.

Once removed, you can run a scrub if you like.
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
The panic isn't a good thing.
In order to resolve your Degraded state, you need to remove the "Unavailable" drive from the pool. As long as the drive remains unavailable, you will never be able to get rid of the Degraded state.

Once removed, you can run a scrub if you like.
Thanks -- apparently not an option here. The device name has changed since my previous post, but this is the unavailable drive from above.
Code:
root@storage:~ # zpool detach tank 13182715825251778798
cannot detach 13182715825251778798: only applicable to mirror and replacing vdevs
root@storage:~ #
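That matches the error: `detach` only applies to mirror and replacing vdevs. For a raidz member the per-device operation is `zpool offline`. A sketch, using an echo-only wrapper so the commands can be reviewed before running them against a real pool:

```shell
#!/bin/sh
# Echo-only wrapper: prints each command instead of executing it.
# Swap the body for  run() { "$@"; }  when ready to run for real.
run() { echo "+ $*"; }

# 'zpool detach' is only for mirror/replacing vdevs (as the error says);
# 'zpool offline' works per device, using the id from the output above.
run zpool offline tank 13182715825251778798
run zpool status tank
```

Offlining won't clear the Degraded state by itself, but it stops the pool from touching the device.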
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I think you have to use the name with the gptid such as this one:

/dev/gptid/eaadcad5-0882-11e9-942e-00074307578b
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
Through the bug report, I was advised that another drive in the vdev was timing out, and that was causing the panic. I took it offline and was able to complete the resilver. I'm now in the process of replacing that drive.
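For anyone hitting the same thing later, the sequence that worked can be sketched roughly like this. The gptids below are placeholders, not the real device names from this pool, and the commands are echoed rather than executed so they can be sanity-checked first:

```shell
#!/bin/sh
# Echo-only wrapper: prints each command instead of executing it.
# Swap the body for  run() { "$@"; }  to actually run the commands.
run() { echo "+ $*"; }

# 1. Take the timing-out member offline so it stops panicking the box
#    (gptid/TIMING-OUT-DISK is a placeholder, not a real device)
run zpool offline tank gptid/TIMING-OUT-DISK
# 2. Let the in-flight resilver finish, watching progress as it goes
run zpool status tank
# 3. Then replace the offlined disk with a fresh one in turn
#    (gptid/NEW-DISK is likewise a placeholder)
run zpool replace tank gptid/TIMING-OUT-DISK gptid/NEW-DISK
```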
 