System keeps crashing

Status
Not open for further replies.

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
Since yesterday my FreeNAS server keeps crashing. Everything runs fine, then it just powers down and automatically tries to power up again. While it is trying to power up, the fans spin up for 1/2 seconds and then spin down again. This can happen 5-10 times before the system reboots, but sometimes it takes 30-40 minutes before it reboots.

While it is trying to boot, if I go into IPMI and look under Post Snooping I get a Code 15, but I can't find any information on what this code means. Does anyone know what it means?

The longest the system has stayed up once it has managed to reboot is about 20 minutes; then it goes through the same sequence of trying to boot again.

It sounds like a hardware fault to me. I am in touch with SuperMicro, but so far the only help I've had is:

1. Remove ALL power from the system
2. Remove the mainboard BIOS battery
3. Short the 2 metal connections on the battery socket with a metal tool for ~5 seconds
4. Re-insert the BIOS battery and restore power to the system

I have tried the above numerous times but to no avail.

Are there any logs that FreeNAS creates that may give me more information about what is actually happening? I have the 'Enable debug kernel' and 'Enable automatic upload of kernel crash dumps' options selected in the Advanced Settings menu.

Has anyone got any ideas of what the problem may be?

System build info is in my signature below.

Many thanks Grant
 

term

Dabbler
Joined
Jan 14, 2014
Messages
10
The first thing you need to do is verify the voltages coming from your power supply. In a perfect world you would check them with a scope so you could see whether there is any bouncing going on. Failing that, swap it out.

If that does not work, move on to the CPU. Swap it out with a spare.

Then repeat with the motherboard.

This is a hardware problem, treat it as a simple process of elimination.
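
For anyone reading the rail voltages with a multimeter or from the IPMI sensor page, the check mentioned above can be sketched roughly as follows. This is only a minimal sketch: it assumes the commonly cited ATX tolerance of +/-5% on the +12 V, +5 V and +3.3 V rails, and the function names and sample readings are made up for illustration.

```python
# Assumption: +/-5% is the commonly cited ATX tolerance for the positive rails.
TOLERANCE = 0.05


def rail_in_spec(nominal, measured, tolerance=TOLERANCE):
    """Return True if the measured voltage is within tolerance of nominal."""
    return abs(measured - nominal) <= nominal * tolerance


def out_of_spec(readings, tolerance=TOLERANCE):
    """Return the nominal rails whose measured value falls outside tolerance.

    `readings` maps a nominal rail voltage to the value you measured.
    """
    return [
        nominal
        for nominal, measured in readings.items()
        if not rail_in_spec(nominal, measured, tolerance)
    ]


if __name__ == "__main__":
    # Hypothetical readings, not taken from any real system.
    sample = {12.0: 11.2, 5.0: 5.1, 3.3: 3.3}
    for nominal in out_of_spec(sample):
        print(f"+{nominal} V rail reads {sample[nominal]} V: out of tolerance")
```

A steady reading that passes this check does not rule out the power supply, since a multimeter or sensor average will not show short dips or bouncing. That is why a scope, or simply swapping the supply, is the better test.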
 

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
Thanks for your reply, term. Unfortunately I am a home user, so I don't have a spare power supply or CPU to swap in and out.
 

ser_rhaegar

Patron
Joined
Feb 2, 2014
Messages
358
If you have a local computer repair shop, a lot of mom-and-pop stores will bench your equipment for free or for a reasonable price. They might not have a compatible processor or motherboard (most don't deal in Xeons), but they should at least be able to try a different PSU for you.
 

indy

Patron
Joined
Dec 28, 2013
Messages
287
You could run memtest for an easy start.
After that I would install Windows on some old hard disk and run Prime95.
It might not help at all, but at least you don't have to tinker with your hardware too much.
 

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
Just to show what a NOOB I am: how do I run memtest from my FreeNAS box?
 

indy

Patron
Joined
Dec 28, 2013
Messages
287
It's super simple.
Download the USB stick auto-installer from http://www.memtest.org/
After installing, just boot from the stick and the test will run.
 

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
Thanks indy - Running MemTest as I type.
 

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
I left the test running while I went out. When I got back, memtest had completed with 'No Errors', and the system had been up for over 3 hours. I stopped the test and tried to reboot back into FreeNAS, and it struggled to boot; it eventually booted after about 40 attempts.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I, too, suspect the power supply. It's $40 to try to fix it. In the United States, you might get something like this, for cheap, just to have around for this situation:

http://www.newegg.com/Product/Product.aspx?Item=N82E16817339020&ignorebbr=1

It's a perfectly good power supply for emergencies, or for situations like yours. It would only set you back, what, $25 or something. I'd suggest having one around; the power supply is the most likely component to fail in most boxes.
 

Grantp

Contributor
Joined
Feb 26, 2013
Messages
111
Thanks DrKK, I hadn't thought about getting a cheap power supply just for testing and emergencies.

As an update to my problem: as I said above, after running MemTest I eventually got the system to reboot, and I've had no problems since. Power down and reset both work perfectly.

I have no idea what the problem was. SuperMicro Support said it had to be a power supply problem, but never did answer my question about what 'Boot Post Code 15' actually meant.

Anyway, for now I am up and running again. Thanks for all the help and suggestions you guys gave.

Cheers Grant
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
I use OCCT TP for stability testing. Windows only though. Good thing about it is, if it gives you an error you have a problem, no need to guess if it was a fluke.
http://www.ocbase.com/

A bad connection to the PSU maybe?
 