FreeNAS 9.10.1-U4 failed to start up after server power reset

Status
Not open for further replies.

Konstantin

Dabbler
Joined
May 7, 2016
Messages
13
Hi, guys
We had a problem with vol0 at more than 90% utilization. The server became unresponsive, and one of my colleagues power-cycled it from the iLO console.
Now, during boot, we see this error:

Code:
slow spa_sync: started 1660 seconds ago, calls 133
panic: I/O to pool 'vol0' appears to be hung on vdev guid 7371700154187108762 at '/dev/gptid/dc2f0415-bd94-11e6-8f13-d89d67189808'.
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0f9633f680
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0f9633f730
vpanic() at vpanic+0x126/frame 0xfffffe0f9633f770
panic() at panic+0x43/frame 0xfffffe0f9633f7d0
vdev_deadman() at vdev_deadman+0x172/frame 0xfffffe0f9633f820
vdev_deadman() at vdev_deadman+0x41/frame 0xfffffe0f9633f870
vdev_deadman() at vdev_deadman+0x41/frame 0xfffffe0f9633f8c0
spa_deadman() at spa_deadman+0x89/frame 0xfffffe0f9633f8f0
softclock_call_cc() at softclock_call_cc+0x17b/frame 0xfffffe0f9633f9b0
softclock() at softclock+0x94/frame 0xfffffe0f9633f9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0xab/frame 0xfffffe0f9633fa20
ithread_loop() at ithread_loop+0x96/frame 0xfffffe0f9633fa70
fork_exit() at fork_exit+0x9a/frame 0xfffffe0f9633fab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0f9633fab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100027 ]
Stopped at	  kdb_enter+0x3e: movq	$0,kdb_why
db>
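
To identify which disk the deadman panic points at, the pool name, vdev guid, and device path can be pulled out of the panic line. The following is a minimal sketch, assuming the message wording matches the console output in this post. The function name `parse_deadman_panic` is made up for illustration and is not part of FreeNAS or ZFS.

```python
import re

# Pattern for the deadman panic line as it appears in this thread's console
# output. ASSUMPTION: other releases may word the message differently.
PANIC_RE = re.compile(
    r"I/O to pool '(?P<pool>[^']+)' appears to be hung "
    r"on vdev guid (?P<guid>\d+) at '(?P<path>[^']+)'"
)


def parse_deadman_panic(console_text):
    """Return a dict with pool, guid, and path, or None if nothing matches."""
    # Console captures are often hard-wrapped mid-word at a fixed width,
    # so remove the line breaks before matching.
    joined = "".join(console_text.splitlines())
    match = PANIC_RE.search(joined)
    return match.groupdict() if match else None


# The panic text from this post, hard-wrapped as a console capture might be.
sample = (
    "slow spa_sync: started 1660 seconds ago, calls 133panic: I/O to pool 'vol0' appe\n"
    "ars to be hung on vdev guid 7371700154187108762 at '/dev/gptid/dc2f0415-bd94-11e\n"
    "6-8f13-d89d67189808'.\n"
)

info = parse_deadman_panic(sample)
print(info)
```

On FreeBSD-based systems, the output of `glabel status` should show which physical device carries the extracted gptid label.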



Single-user mode works fine.

Is there any way to save the data?

The hardware is an HP Gen9 360 with 14 drives and LSI cards (JBOD).

Any help is appreciated,
Thank you!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Is there any way to save the data?
Almost certainly a fresh installation to a fresh boot device will see your data just fine. But you should never let your pool get that full.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Won't an upgrade do the trick?
It could--boot the installer ISO, tell it to install to your existing boot device, and it will offer to upgrade the existing installation. Or just install to a clean device, upload your saved config file (you have one, right?), and you'll be set.
 

Konstantin

Dabbler
Joined
May 7, 2016
Messages
13
OK, I upgraded FreeNAS. It takes about 1.5 hours to import all ZFS volumes.
After that, it works for some time (still showing the same error related to 90% utilization), then hangs and becomes unresponsive.
I tried deleting files from FreeNAS itself. It then shows no files, but after every reboot they are still there.
Is there a way to drop those files?
I also tried simply destroying the dataset, but that didn't work :/
Thanks
 

Konstantin

Dabbler
Joined
May 7, 2016
Messages
13
Here are the processes that are stuck:

root 5530 0.0 0.0 40524 3592 - D 16:07 0:00.01 zfs destroy -r vol0/sharedvol
root 5731 0.0 0.0 40524 3716 - D 16:10 0:00.01 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
root 6556 0.0 0.0 40524 3716 - D 16:21 0:00.02 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
root 6557 0.0 0.0 40524 3716 - D 16:21 0:00.02 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
root 8549 0.0 0.0 40524 3724 - D 16:35 0:00.02 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
root 9316 0.0 0.0 40524 3724 - D 16:49 0:00.01 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
root 9455 0.0 0.0 40524 3724 - D 16:50 0:00.02 /sbin/zfs get -r -H -o name,property,value,source -t filesystem,volume compression,compressratio,readonly,o
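
Every row in this listing has `D` in the state column, which `ps` uses for a process in uninterruptible (typically disk) wait. The following is a minimal sketch for picking such rows out of a longer `ps aux` capture, assuming the same eleven-column layout as above. The helper name `stuck_processes` and the `cron` row in the sample are hypothetical, added only to show the filtering.

```python
def stuck_processes(ps_output):
    """Yield (pid, command) for rows whose STAT field starts with 'D'."""
    for line in ps_output.splitlines():
        # Ten fixed columns, then the command (which may contain spaces).
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and blank or malformed lines
        if fields[7].startswith("D"):
            yield int(fields[1]), fields[10].rstrip()


# First data row is taken from the post; the cron row is HYPOTHETICAL.
sample = """\
USER  PID %CPU %MEM   VSZ  RSS TT STAT STARTED    TIME COMMAND
root 5530  0.0  0.0 40524 3592  - D    16:07   0:00.01 zfs destroy -r vol0/sharedvol
root 1234  0.0  0.0 20000 2000  - Ss   16:00   0:00.10 /usr/sbin/cron -s
"""

for pid, command in stuck_processes(sample):
    print(pid, command)
```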
 

Konstantin

Dabbler
Joined
May 7, 2016
Messages
13
I cleaned up the disk, but something is weird: the zpool vol0 shows 82% used, while the actual used space on volume vol0 is 46%.
FreeNAS is also extremely slow. ZFS commands (changing quotas, creating a new dataset, and so on) take forever.
I checked for a scrub on vol0, and none is running.
What's more interesting, it shows 7.2T, but that's a raidz2 with 8 drives :D, which is actually impossible!
Any suggestions?
Thanks
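
When checking a reported size against the raidz2 layout, it may help to keep raw and usable capacity apart: for raidz vdevs, `zpool list` reports raw space including parity, while `zfs list` reports space usable for data. The following is a rough sketch of that arithmetic, ignoring padding and metadata overhead. The drive sizes in the loop are hypothetical, since the thread does not state them, and the function names are made up for illustration.

```python
def raidz_usable_fraction(drives, parity):
    """Approximate share of raw capacity left for data in one raidz vdev."""
    if not 0 < parity < drives:
        raise ValueError("parity must be between 1 and drives - 1")
    return (drives - parity) / drives


def raidz_usable_tib(drives, parity, drive_tib):
    """Approximate usable capacity in TiB, ignoring padding and metadata."""
    return drives * drive_tib * raidz_usable_fraction(drives, parity)


# raidz2 (two parity drives) across 8 drives, as described in the post.
fraction = raidz_usable_fraction(8, 2)
print(f"usable fraction: {fraction}")

# HYPOTHETICAL drive sizes, only to show how the estimate scales.
for drive_tib in (1.0, 2.0, 4.0):
    raw = 8 * drive_tib
    usable = raidz_usable_tib(8, 2, drive_tib)
    print(f"{drive_tib} TiB drives: raw {raw} TiB, usable about {usable} TiB")
```

This estimate only bounds the expected sizes. It does not by itself explain a gap between the pool-level and dataset-level usage percentages.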
 

Attachments

  • Screen Shot 2017-09-26 at 8.11.46 PM.png