Start by checking /var/log/messages.
But solid advice has already been given (including on the closed bug where you commented:
https://bugs.freenas.org/issues/7499), clearly stating that heat could cause this. Your response was:
Yes the disks are too hot (at times) but generally, when no scrub or smart long test is being performed, they run sub 50c - not ideal but they've done this for 8 months.
Definitely would like to cool better and or replace the whole system one day but as it stands, I'd love to know what log file I should be pulling in case this happens again.
As I said before: I've had this in the past a couple of times. Every time, EVERY. SINGLE. TIME., it was caused by hardware issues (CPU too hot, failing boot device). Fix the hardware and the issue goes away.
If this is hardware related, log files might give you a hint, or not (I just spent 3 weeks hunting a hardware issue with FreeNAS causing lock-ups/reboots without a single log entry...).
Running like that for 8 months does not guarantee you will be able to run like that for 8 more. Hardware breaks. Used in less than ideal conditions, it breaks faster. If your disks are running at ~50°C normally and above 50°C when scrubbing, I can only imagine what the CPU and RAM temperatures must be...
So? Have you done anything about the heat issue?
Did you do something about the temperature of the disks? (48°C COULD be an issue: disks or boot device(s) not responding quickly enough, causing nginx/django to crash.)
What is your CPU temp? (sysctl -a | grep temp)
Have you run a memtest on the non-ECC memory?
Did you do a scrub on the boot devices?
Have you tried doing a fresh install on a new USB stick (export config, install latest image, import config) to see if this helps?
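
If it helps, here is a rough sketch of how the temperature and boot-device checks above could be run from a shell. The helper name hot_sensors, the 70°C limit, the device /dev/ada0 and the pool name freenas-boot are my assumptions, so adjust them to your system. The CPU sensors only show up in sysctl if the coretemp or amdtemp driver is loaded.

```shell
#!/bin/sh
# Sketch only: hot_sensors is a made-up helper name, not a FreeNAS tool.
# It reads "name: value" lines on stdin (the format `sysctl -a` prints,
# e.g. "dev.cpu.0.temperature: 45.0C") and prints the temperature
# lines whose value is above the limit passed as the first argument.
hot_sensors() {
    awk -v limit="$1" -F': ' '
        /temperature/ {
            t = $2
            sub(/C$/, "", t)               # strip the trailing "C"
            if (t + 0 > limit + 0) print   # compare numerically
        }'
}

# On the box itself (limit of 70 is an arbitrary example):
#   sysctl -a | hot_sensors 70
# Disk temperature and boot pool health (device and pool names are assumptions):
#   smartctl -A /dev/ada0 | grep -i temperature
#   zpool scrub freenas-boot
#   zpool status freenas-boot
```

Running the first command during a scrub, when the disks are at their hottest, would probably tell you the most.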