Server crashing once a month on average

dhoepp

Cadet
Joined
Dec 23, 2019
Messages
3
At first I thought it was a power outage or something, but it has now happened three times in a row, 3-5 weeks apart each time.

When the server boots back up, the screen reads "this is a freenas data disk and cannot boot system",
and I then need to select the OS disk in the boot selection screen to resume. However, a restart from there automatically selects the OS disk.

In /var/log/messages the entire day before the reboot shows these messages:
Code:
Dec 22 00:00:00 freenas syslog-ng[2471]: Configuration reload request received, reloading configuration;
Dec 22 00:00:00 freenas syslog-ng[2471]: Configuration reload finished;
Dec 22 04:55:24 freenas sshd[25391]: fatal: ssh_packet_get_cstring: string is too large [preauth]
Dec 22 04:55:24 freenas sshd[25392]: fatal: ssh_packet_get_cstring: string is too large [preauth]
Dec 22 07:53:53 freenas sshd[50356]: fatal: ssh_packet_get_cstring: incomplete message [preauth]
Dec 22 20:27:31 freenas ahcich5: Timeout on slot 21 port 0
Dec 22 20:27:31 freenas ahcich5: is 00000000 cs 00200000 ss 00000000 rs 00200000 tfd 40 serr 00000000 cmd 10009417
Dec 22 20:28:22 freenas ahcich5: Timeout on slot 22 port 0
Dec 22 20:28:22 freenas ahcich5: is 00000000 cs 00400000 ss 00000000 rs 00400000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:28:22 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:28:22 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:28:22 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:29:13 freenas ahcich5: Timeout on slot 23 port 0
Dec 22 20:29:13 freenas ahcich5: is 00000000 cs 00800000 ss 00000000 rs 00800000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:29:13 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:29:13 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:29:13 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:30:04 freenas ahcich5: Timeout on slot 24 port 0
Dec 22 20:30:04 freenas ahcich5: is 00000000 cs 01000000 ss 00000000 rs 01000000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:30:04 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:30:04 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:30:04 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retry was blocked
Dec 22 20:30:04 freenas ada4 at ahcich5 bus 0 scbus5 target 0 lun 0
Dec 22 20:30:04 freenas ada4: <ST32000644NS GGB8> s/n 9WM5LT2N detached
Dec 22 20:30:55 freenas ahcich5: Timeout on slot 25 port 0
Dec 22 20:30:55 freenas ahcich5: is 00000000 cs 02000000 ss 00000000 rs 02000000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:30:55 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:30:55 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:30:55 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:31:46 freenas ahcich5: Timeout on slot 26 port 0
Dec 22 20:31:46 freenas ahcich5: is 00000000 cs 04000000 ss 00000000 rs 04000000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:31:46 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:31:46 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:31:46 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:32:47 freenas ahcich5: Timeout on slot 31 port 0
Dec 22 20:32:47 freenas ahcich5: is 00000000 cs 80000000 ss 00000000 rs 80000000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:32:47 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:32:47 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:32:47 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:33:38 freenas ahcich5: Timeout on slot 0 port 0
Dec 22 20:33:38 freenas ahcich5: is 00000000 cs 00000001 ss 00000000 rs 00000001 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:33:38 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:33:38 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:33:38 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:34:35 freenas ahcich5: Timeout on slot 3 port 0
Dec 22 20:34:35 freenas ahcich5: is 00000000 cs 00000008 ss 00000000 rs 00000008 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:34:35 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:34:35 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:34:35 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:35:25 freenas ahcich5: Timeout on slot 4 port 0
Dec 22 20:35:25 freenas ahcich5: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:35:25 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:35:25 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:35:25 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:36:27 freenas ahcich5: Timeout on slot 9 port 0
Dec 22 20:36:27 freenas ahcich5: is 00000000 cs 00000200 ss 00000000 rs 00000200 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:36:27 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:36:27 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:36:27 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:37:18 freenas ahcich5: Timeout on slot 10 port 0
Dec 22 20:37:18 freenas ahcich5: is 00000000 cs 00000400 ss 00000000 rs 00000400 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:37:18 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:37:18 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:37:18 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:38:14 freenas ahcich5: Timeout on slot 13 port 0
Dec 22 20:38:14 freenas ahcich5: is 00000000 cs 00002000 ss 00000000 rs 00002000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:38:14 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:38:14 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:38:14 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:39:05 freenas ahcich5: Timeout on slot 14 port 0
Dec 22 20:39:05 freenas ahcich5: is 00000000 cs 00004000 ss 00000000 rs 00004000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:39:05 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:39:05 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:39:05 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:40:07 freenas ahcich5: Timeout on slot 19 port 0
Dec 22 20:40:07 freenas ahcich5: is 00000000 cs 00080000 ss 00000000 rs 00080000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:40:07 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:40:07 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:40:07 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:40:58 freenas ahcich5: Timeout on slot 20 port 0
Dec 22 20:40:58 freenas ahcich5: is 00000000 cs 00100000 ss 00000000 rs 00100000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:40:58 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:40:58 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:40:58 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
Dec 22 20:41:54 freenas ahcich5: Timeout on slot 23 port 0
Dec 22 20:41:54 freenas ahcich5: is 00000000 cs 00800000 ss 00000000 rs 00800000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:41:54 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:41:54 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:41:54 freenas (aprobe0:ahcich5:0:0:0): Retrying command
Dec 22 20:42:45 freenas ahcich5: Timeout on slot 24 port 0
Dec 22 20:42:45 freenas ahcich5: is 00000000 cs 01000000 ss 00000000 rs 01000000 tfd 50 serr 00000000 cmd 10008017
Dec 22 20:42:45 freenas (aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Dec 22 20:42:45 freenas (aprobe0:ahcich5:0:0:0): CAM status: Command timeout
Dec 22 20:42:45 freenas (aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted


I appreciate your time and consideration. Please let me know what further information you need.
 

dhoepp

Cadet
Joined
Dec 23, 2019
Messages
3
OS Version:
FreeNAS-11.2-U5
(Build Date: Jun 24, 2019 18:41)
Processor:
Intel(R) Atom(TM) CPU C2550 @ 2.41GHz (4 cores)
Memory:
8 GiB

The Motherboard is ASRock, and the case has hot swap hard drive bays up front.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Regular reboots, an unresponsive system, or hard crashes without any other symptom make me suspect a hardware fault. Are you running a fairly current (11.1 or above) version of FreeNAS?

For hardware, can you post your complete system specifications as suggested in the rules post:
  • Motherboard make and model
  • CPU make and model
  • RAM quantity
  • Hard drives, quantity, model numbers, and RAID configuration, including boot drives
  • Hard disk controllers and method of connection
  • Network cards
I would reseat your power and data cables, potentially your RAM as well, and proceed with booting up a memtest CD to check your system stability. If it is unable to complete the tests or throws errors, you have your answer.

Edit: I see you're on an ASRock C2550 - there is a known fault with earlier runs of Atom processors that caused them to die permanently. See here:


I suggest contacting ASRock, even if you are outside of warranty, to request a repair/replacement.
 

dhoepp

Cadet
Joined
Dec 23, 2019
Messages
3
ASRock Rack Mini ITX DDR3 1333 Motherboards (C2550D4I)
Intel Avoton C2550 Quad-Core Processor
DDR3 1600/1333 Dual-channel Max. 64GB UDIMM
2 SATA3 6.0Gbps, 4 SATA2 3.0Gbps by C2550, 4 x SATA3 6.0 Gb/s by Marvell SE9230, 2 x SATA3 6.0 Gb/s by Marvell SE9172
Dual Intel i210 Gigabit LAN ports (with Teaming function)

I have four 2TB hard drives running as a 4TB configuration, connected over SATA cables.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
When the server boots back up the screen reads "this is a freenas data disk and cannot boot system" where you then need to select the OS disk in the boot selection screen to resume.

Have to agree with @HoneyBadger. The issue quoted above is your BIOS losing its configuration; it has nothing to do with FN. Your CMOS battery is probably dead, which would make the BIOS lose its config, so check it. That has nothing to do with the reboot, by the way. Follow the steps from @HoneyBadger's reply.
 

poldi

Dabbler
Joined
Jun 7, 2019
Messages
42
Hi,
that does sound like a hardware issue, most likely related to your disk at ada4, going by this message:
Dec 22 20:30:04 freenas ada4: <ST32000644NS GGB8> s/n 9WM5LT2N detached
Did you notice whether the pool was resilvering when it came back online? FreeNAS should have sent an alert if you have this configured correctly.
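To see at a glance which channel keeps timing out and which disk dropped off the bus, a log like the one in the first post can be summarized with a short script. This is only a rough sketch: the function name `summarize_log` and both regular expressions are my own, written against the few lines quoted in this thread, so they may need adjusting for other message formats.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log lines quoted in this thread.
TIMEOUT_RE = re.compile(r"(ahcich\d+): Timeout on slot (\d+) port (\d+)")
DETACH_RE = re.compile(r"(\w+): <([^>]+)> s/n (\S+) detached")

# Three lines copied from the excerpt in the first post.
SAMPLE = """\
Dec 22 20:27:31 freenas ahcich5: Timeout on slot 21 port 0
Dec 22 20:28:22 freenas ahcich5: Timeout on slot 22 port 0
Dec 22 20:30:04 freenas ada4: <ST32000644NS GGB8> s/n 9WM5LT2N detached
"""


def summarize_log(lines):
    """Count AHCI timeouts per channel and collect detached devices."""
    timeouts = Counter()
    detached = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match.group(1)] += 1
            continue
        match = DETACH_RE.search(line)
        if match:
            # (device name, model/firmware string, serial number)
            detached.append(match.groups())
    return timeouts, detached


if __name__ == "__main__":
    timeouts, detached = summarize_log(SAMPLE.splitlines())
    for channel, count in timeouts.items():
        print(f"{channel}: {count} timeouts")
    for device, model, serial in detached:
        print(f"{device} ({model}, s/n {serial}) detached")
```

To run it against the real log, replace `SAMPLE.splitlines()` with an open file handle on /var/log/messages. If every timeout lands on the same channel as the detached disk, that points at that one drive, cable, or port rather than the whole controller.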
You can also check whether /data/crash has any more information on the restart. Check info.last first; textdump.tar.last.gz should have more detailed logs.
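If you prefer to look at those files from a script, something along these lines should work. It is a sketch under assumptions: the helper name `inspect_crash_dir` is made up, and I am assuming textdump.tar.last.gz is a gzip-compressed tar archive, as its name suggests. If no crash dump was written, the files will simply be missing and the result stays empty.

```python
import tarfile
from pathlib import Path


def inspect_crash_dir(crash_dir="/data/crash"):
    """Return the text of info.last and the member names of the textdump."""
    crash = Path(crash_dir)
    report = {"info": None, "textdump_members": []}

    info = crash / "info.last"
    if info.is_file():
        report["info"] = info.read_text(errors="replace")

    dump = crash / "textdump.tar.last.gz"
    if dump.is_file():
        # Assumption: a gzip-compressed tar archive.
        with tarfile.open(dump, "r:gz") as tar:
            report["textdump_members"] = tar.getnames()

    return report


if __name__ == "__main__":
    result = inspect_crash_dir()
    print(result["info"] or "no info.last found")
    for name in result["textdump_members"]:
        print(name)
```

An empty result would fit a hard power loss or hardware hang better than a kernel panic, since a panic usually leaves a dump behind.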
Honestly, it sounds to me like a failing drive or a broken cable (the former being more likely).
Hope that helps.
 