slow spa_sync on reboot following a panic

Status
Not open for further replies.
Joined
Oct 23, 2015
Messages
3
Hi,
Apologies in advance that my first forum post is a problem.

I have a Dell PowerEdge 2950 running BIOS 2.5.0, with 32GB RAM and dual Xeon CPUs. Storage is 4x 2TB WD drives in RAIDZ1, plus 1x 147GB Seagate 15k SAS as ZIL. Networking is 2x Broadcom Gigabit NICs in an LACP port channel onto a Cisco 2960 switch. The boot environment is an 8GB USB stick (I believe Kingston, but it's in the internal USB socket on the motherboard so I'd have to de-rack the machine to check). FreeNAS version is 9.3.

Various plugins are installed - from memory these are:

Sabnzbd
Headphones
Couchpotato
Plex
Sonarr
VirtualBox - running 2x 2GB Linux VMs, which were shut down at the time
BTSync

Having the day off work, I decided it was time to do some housekeeping and update my plugins while there was no one at home to complain that Plex had gone offline. Hit the upgrade for Plex plugin and went to make a coffee. Came back and it was still showing 50% updated - weird, it's usually done by now!

Left it running a further 20 minutes, then realised that I couldn't actually ping the FreeNAS box.

I fired up the remote access card on the box and spotted that the box was showing an error related to being Out Of Swap Space and processes being killed. The box was hung, wouldn't respond to anything, so I hit the reset button.

Now, "out of swap space" intrigues me, as I have 32GB of RAM and I believe the default is 2GB of swap space per disk, so all in all 40GB of available "memory".
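As a quick sanity check of that figure, here is the arithmetic written out. It assumes swap lives only on the four 2TB data disks; whether the ZIL disk carries a swap partition too is not something I've checked.

```python
# Rough total of RAM plus swap, using the figures from this post.
# Assumption: only the four RAIDZ1 data disks carry a 2 GB swap partition each.
ram_gb = 32
data_disks = 4
swap_per_disk_gb = 2

swap_gb = data_disks * swap_per_disk_gb
total_gb = ram_gb + swap_gb
print(f"swap: {swap_gb} GB, RAM + swap: {total_gb} GB")
```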

Having hit reset, the machine started booting and is now stuck at "slow spa_sync".

Last log lines I can see on the screen

Code:
Beginning ZFS volume imports

Importing 11152677867531535824

txg 5625957 open pool version 5000; software version 5000/5; uts 9.3-RELEASE-p2
txg 5625960 import pool version 5000; software version 5000/5; uts 9.3-RELEASE-

slow spa_sync: started 1320 seconds ago, calls 65


I read on one of the forum posts that Ctrl+T could be pressed to get some status. I've done that, however I don't know what it means :P

Code:
load: 0.57 cmd: zpool 778 [zio->io_cv] 2511.25r 0.00u 40.92s 0% 8480k


My question really is: what do I do next? Is it a wait-and-see job, or a reinstall-FreeNAS-on-a-new-stick job?

I have my important data backed up by hourly rsync task to a Linux server elsewhere on the network, so it's not the absolute end of the world if it ends up being a rebuild, but I'd rather avoid the hassle if at all possible.

Any help appreciated

Matthew
 
Joined
Oct 23, 2015
Messages
3
Sorry, I forgot to add that the machine doesn't have RAID capability. The disks are passed straight through to FreeNAS with no read or write caching on the controller, other than the read cache on the disks themselves.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Is this a virtual FreeNAS installation?

Just so you know, the best course is to let it sit for a few hours. The CTRL+T output shows that I/O is either in progress or locked up, and the only way to tell which is to wait and see what happens.
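For anyone else squinting at that CTRL+T line, here is a rough sketch of pulling it apart in Python. The field meanings (load average, command, PID, wait channel in brackets, real/user/system seconds, CPU percent, resident memory in kB) are my reading of the usual FreeBSD status line, and the regex and the `parse_siginfo` name are made up for illustration.

```python
import re

# Hypothetical helper: split a FreeBSD Ctrl+T status line into named fields.
# The field layout is an assumption based on the line posted in this thread.
SIGINFO_RE = re.compile(
    r"load: (?P<load>[\d.]+) +cmd: (?P<cmd>\S+) (?P<pid>\d+) "
    r"\[(?P<wchan>[^\]]+)\] (?P<real>[\d.]+)r (?P<user>[\d.]+)u "
    r"(?P<sys>[\d.]+)s (?P<cpu>\d+)% (?P<rss_kb>\d+)k"
)

def parse_siginfo(line):
    m = SIGINFO_RE.search(line)
    if m is None:
        raise ValueError("not a recognised Ctrl+T status line")
    d = m.groupdict()
    return {
        "load": float(d["load"]),
        "cmd": d["cmd"],
        "pid": int(d["pid"]),
        "wchan": d["wchan"],          # what the process is currently sleeping on
        "real_s": float(d["real"]),   # wall-clock seconds
        "user_s": float(d["user"]),   # user CPU seconds
        "sys_s": float(d["sys"]),     # kernel CPU seconds
        "cpu_pct": int(d["cpu"]),
        "rss_kb": int(d["rss_kb"]),
    }

info = parse_siginfo(
    "load: 0.57 cmd: zpool 778 [zio->io_cv] 2511.25r 0.00u 40.92s 0% 8480k"
)
print(info["cmd"], info["wchan"], round(info["real_s"] / 60), "min elapsed")
```

Read that way, the zpool process has been going for a bit over 40 minutes of wall-clock time, has used about 41 seconds of kernel CPU, and is currently sleeping on `zio->io_cv`, which looks like a wait for I/O to complete.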
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So update us after it's sat for 6 hours or so. If you see any hard drive activity, that would be useful to know. If the activity light is still on or blinking after 6 hours, then we should wait longer.
 
Joined
Oct 23, 2015
Messages
3
Hi Cyberjock, thanks for the reply! It's a 100% real tin solution :smile: The machine has a Dell Remote Access Card which will let you view and reboot remotely like a KVM.

After what feels like forever, it's just finished booting! I left a ping going to the box, went shopping, made tea, drank a beer or two and it started responding, logged into the web interface and it's looking all there!
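If you'd rather not babysit a ping window, the same wait can be scripted. This is only a sketch: `tcp_probe`, `wait_until_up`, the port, and the address in the example are all hypothetical, and it checks the web interface port rather than ICMP ping.

```python
import socket
import time

def tcp_probe(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (assumed health check)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def wait_until_up(probe, interval_s=60.0, max_wait_s=6 * 3600, sleep=time.sleep):
    """Call probe() every interval_s seconds until it returns True or time runs out.

    Returns the seconds waited, or None if the box never answered in time.
    """
    waited = 0.0
    while waited <= max_wait_s:
        if probe():
            return waited
        sleep(interval_s)
        waited += interval_s
    return None

# Example (hypothetical address):
# wait_until_up(lambda: tcp_probe("192.168.1.50"))
```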

Plex didn't make it... but that's a small price to pay to have the rest of the box back and not have to go through a lengthy restore!

Do you have any clues on what might have happened? It just seems strange to be out of memory when there's far more in there than the FreeNAS spec for the storage size. Of course, the Plex installer could have gone crazy and eaten it all, as my Plex library is fairly large.

Matthew
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Upgrades to the plugins are finicky. No clue what went wrong with the upgrade. I know that in some situations it can take 24 hours (or more) to upgrade Plex. That's why I don't use plugins. I want to manage stuff myself, and when it breaks I can only blame me.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
With older versions of Plex, the upgrade needed to cp the plexdata directory to a tmp location before and after the upgrade. This folder can be tens of GB in size, and I wonder if the cp operation was eating up all your memory. That would explain all the strange behavior that has happened in the past around Plex. I'm not sure how cp actually works, but it should be able to copy large amounts of data without running out of memory. Glad you didn't lose your data, and good job for waiting it out.
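For what it's worth, a copy doesn't have to hold the whole directory in memory. This is only a sketch of a fixed-buffer copy in Python, not how cp or the Plex plugin actually does it; `copy_chunked` and the 1 MiB buffer size are illustrative choices.

```python
def copy_chunked(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy a file through one reusable buffer, so the process's own
    memory use stays near chunk_size regardless of file size."""
    copied = 0
    buf = bytearray(chunk_size)
    view = memoryview(buf)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            n = src.readinto(buf)   # fills buf, returns bytes read (0 at EOF)
            if not n:
                break
            dst.write(view[:n])     # write only the part that was filled
            copied += n
    return copied
```

Whatever the filesystem caches underneath is a separate matter; the point is just that the copying process itself only needs one buffer.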
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I believe that newer versions of FreeBSD use mmap() as part of the cp process, which has some interesting implications.
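To show what an mmap()-style copy looks like (this is not FreeBSD's cp source, just a Python sketch with made-up names): the source file is mapped into the process's address space and written out in windows, so pages are faulted in by the VM system rather than fetched by explicit read() calls.

```python
import mmap

def copy_mmap(src_path, dst_path, window=1024 * 1024):
    """Copy a file by memory-mapping the source and writing it out in windows."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = src.seek(0, 2)  # seek to the end to learn the file size
        if size == 0:
            return 0           # mmap cannot map an empty file
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for off in range(0, size, window):
                dst.write(mm[off:off + window])
                copied += min(window, size - off)
    return copied
```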
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
In all seriousness, when people have problems where zpools won't mount on bootup, as a general rule:

1. If the system kernel panics during the mounting process, that's *very* bad news. Recovery is rarely possible. (if you are in this situation you should definitely be asking ZFS pros what to do as you are in a very precarious position)

2. If the system appears to do anything besides crash (even if disk activity appears to be non-existent) you're typically better off waiting it out. Generally within 6 hours you're going to be in one of two situations:

- The system is still appearing idle (this is bad news)
- The system appears to be doing *something* (in which case you should wait longer)

Easy flowchart, eh?
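The same flowchart written out as a toy function. The function name and return strings are made up for illustration, and the 6 hour figure is just the rule of thumb from this post, not a hard limit.

```python
def import_advice(kernel_panicked, hours_waited, any_activity):
    """Toy encoding of the rule of thumb for a pool that is slow to mount at boot."""
    if kernel_panicked:
        return "very bad: ask ZFS experts before doing anything else"
    if hours_waited < 6:
        # Even with no visible disk activity, waiting is usually the better bet.
        return "wait it out"
    if any_activity:
        return "still doing something: wait longer"
    return "still idle after 6 hours: bad news"

print(import_advice(False, 0.5, False))
```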
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
(Quoted cyberjock's post above in full and replied with an image: cm-50190-9511cbdf4410a9.jpeg)
 