Spontaneous reboots - but always at the same time

whosmatt

Dabbler
Joined
Jun 6, 2012
Messages
20
So, I understand that spontaneous reboots are often a problem with hardware. But here's my situation. Three times in the last 4 or 5 months, my home FreeNAS box has rebooted itself precisely at midnight Saturday night / Sunday AM. Spontaneous? Yep. Random? 3x at the exact same time says no.

A few things happen at that time:
1) The cron jobs for pool scrubs run then. But the reboots don't necessarily correlate with a scrub actually starting.
2) I have backup jobs for various systems in my home network that often start at midnight. Many of these write their backups to FreeNAS via NFS. Many of them are also VMs that live on iSCSI storage on FreeNAS. But those happen nightly, not just on Saturdays.
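
One way to sanity-check the "same time every time" pattern is to compare the reboot timestamps by their position in the week rather than by date. Below is a minimal Python sketch of that idea. The function names are hypothetical, and the timestamps are made-up placeholders (three Sundays just after midnight), not the actual reboot times from this box.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY


def minute_of_week(ts):
    """Minutes elapsed since Monday 00:00 of the week containing ts."""
    return ts.weekday() * MINUTES_PER_DAY + ts.hour * 60 + ts.minute


def same_weekly_slot(timestamps, tolerance_minutes=5):
    """True if every timestamp is within tolerance of the first one's
    weekday and time of day. The distance wraps around the week, so
    Sunday 23:59 and Monday 00:01 count as two minutes apart."""
    if not timestamps:
        return False
    ref = minute_of_week(timestamps[0])
    for ts in timestamps[1:]:
        diff = abs(minute_of_week(ts) - ref)
        diff = min(diff, MINUTES_PER_WEEK - diff)
        if diff > tolerance_minutes:
            return False
    return True


# Placeholder reboot times, for illustration only.
reboots = [
    datetime(2020, 1, 5, 0, 0),
    datetime(2020, 2, 16, 0, 1),
    datetime(2020, 3, 29, 0, 0),
]
print(same_weekly_slot(reboots))
```

If this returns True for the real reboot times, something on a weekly schedule is a more likely trigger than a random hardware fault.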

Those are the basics.

Hardware is:
Asus M5A99FX Pro R2.0 Motherboard
Dell Perc H310 flashed to Firmware version : 20.00.07.00
Seasonic FOCUS Plus 650 Gold SSR-650FX PSU
AMD Opteron 3260HE CPU
20GB Unbuffered DDR3 ECC RAM
HP NC-360T Intel 82571EB NIC
PCI Rage XL display card
7x 3TB Hitachi 7200RPM SATA
4x 8TB mixed Seagate, Toshiba, WD SATA
1x Adata 60GB SSD read cache


boot pool is 2x USB sticks, one SanDisk 8GB and one Kingston 16GB
data pools are 6x 3TB in RAIDZ2 with one hot spare and read cache SSD
and 3x 8TB in RAIDZ with one hot spare

I'm already sending syslog to an external server but nothing is standing out. Is there some more verbose logging I could set up? I just cranked it up to debug in the GUI, but I'm wondering if there's anything else I could look at that might not be picked up by syslog.
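
For digging through the forwarded syslog, one option is to pull out the handful of lines logged just before each boot and compare them across reboots. Here is a rough Python sketch, assuming the log is plain text with one message per line and that some string reliably marks the start of a boot. The function name, the marker string, and the sample lines are placeholders, not real FreeNAS output.

```python
from collections import deque


def lines_before_marker(lines, marker, context=5):
    """Return a list of (marker_line, preceding_lines) pairs, one for
    each line that contains marker. preceding_lines holds up to
    `context` lines seen immediately before the marker line."""
    recent = deque(maxlen=context)
    hits = []
    for line in lines:
        if marker in line:
            hits.append((line, list(recent)))
        recent.append(line)
    return hits


# Placeholder log content, for illustration only.
sample = [
    "message one",
    "message two",
    "message three",
    "BOOT-MARKER system starting",
    "message four",
]
for boot_line, before in lines_before_marker(sample, "BOOT-MARKER", context=2):
    print(boot_line)
    for line in before:
        print("  before:", line)
```

With a real log, the lines would come from iterating over an open file, and the marker would be whatever the first kernel message after boot looks like on the receiving server.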

I thought at first that it might be load related but I can stress the box in every way I can think of (particularly live migrating VMs to/from it after a scheduled downtime for hardware maintenance or OS upgrade) and can't reproduce it.
 
Joined
Dec 29, 2014
Messages
1,135
Does it have IPMI or anything like that which could save system event logs? Any chance that there is some kind of power event happening then that causes it? If you manually kick off the things scheduled for that time, does it generate a reboot?
 

whosmatt

Dabbler
Joined
Jun 6, 2012
Messages
20
No, it's a workstation board so no IPMI. All I get is whatever syslog sends to my logstash server. I'll try kicking those cron jobs off manually and see what happens. I also moved them from Sunday to Wednesday just because.

I didn't think about power events. It's on a UPS along with a few low-power ESXi servers, switches, etc., none of which have any issues during the same time period. The UPS isn't network connected and I've never bothered with the USB interface, but I guess I could look and see if it has any logs or scheduled tests or anything.
 

whosmatt

Dabbler
Joined
Jun 6, 2012
Messages
20
One proactive step I'm taking is that I just ordered a couple of SSDs to use in my boot pool. I've replaced too many USB flash drives over the years and now that a 120GB SSD is ~25 USD... you know.

Anecdotally, I had another FreeNAS system that would spontaneously reboot (at random). It was exhibiting no errors (ZFS or otherwise) in the boot pool, but would take so long to boot or perform other operations that I assumed the flash drive(s) had failed. I replaced them with SSDs and brought the flash drives home for a post-mortem, and couldn't find anything wrong with them other than that, under ZFS only, they were really, really slow. I imported the boot pool into a FreeBSD VM and it took an extraordinarily long time. Yet I was able to dd both devices into images and import those with no problem, as quickly as you'd expect. Who knows. Anyway, fingers crossed, I haven't had any spontaneous reboots on that second system since moving from USB flash boot devices to SSDs.
 

whosmatt

Dabbler
Joined
Jun 6, 2012
Messages
20
Well, I just had an interesting event while applying the latest system update to FreeNAS-11.3-U2.

I wish I could have gotten a screenshot faster, but I was watching the boot after the upgrade over serial console.

The last thing in the console (that I can see) before it rebooted is:


Updating CPU Microcode...


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 12
______ _
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Updating CPU Microcode...
You might be able to apply that microcode update from a Linux live USB or with something from your system BIOS manufacturer, to avoid FreeNAS thinking it needs to do it in the first place.
 