nightly crash, need help debugging

ykhodo · Oct 23, 2017

I have an install of FreeNAS that I set up last week. I am getting the nightly emails, but when I wake up in the morning, the machine refuses to accept connections and there is no output on the monitor. The machine has power and is still physically on.

The last things from /var/log/messages are the following:

Code:

Oct 23 03:00:14 hadrian kernel: arp: <redacted> moved from <redacted> on igb0
Oct 23 03:02:25 hadrian kernel: arp: <redacted> moved from <redacted> on igb0

There is nothing in /data/crash.

What would be the best way to debug this issue?

Thanks!

dlavigne · Oct 23, 2017

Post your hardware specs and the output of ifconfig.

ykhodo · Oct 23, 2017

dlavigne said:
Post your hardware specs and the output of ifconfig.

ryzen 1700x
asrock x370 taichi
wd red 4 x 4tb
wd red 2 x 5tb
intel 128gb ssd for the host (it's about 5 years old but at 97% health)
600w corsair psu

Code:

igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=2400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6>
	ether <redacted>
	inet <redacted> netmask 0xffffff00 broadcast <redacted>
	nd6 options=9<PERFORMNUD,IFDISABLED>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	groups: lo
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	groups: bridge
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 9 priority 128 path cost 2000000
	member: epair4a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 8 priority 128 path cost 2000
	member: epair3a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 7 priority 128 path cost 2000
	member: epair2a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 6 priority 128 path cost 2000
	member: epair1a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 5 priority 128 path cost 2000
	member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 4 priority 128 path cost 2000
	member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
			ifmaxaddr 0 port 1 priority 128 path cost 20000
epair0a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
	groups: epair
epair1a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
	groups: epair
epair2a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
	groups: epair
epair3a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
	groups: epair
epair4a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
	groups: epair
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether <redacted>
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect
	status: active
	groups: tap
	Opened by PID 11584

ykhodo · Oct 23, 2017

I'll also add that I am running 4 jails - plex, transmission, couchpotato, and sickrage.

You can see the crash is happening at around 3:05.
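
To pin the time down more precisely, here is a rough sketch in Python (not something built into FreeNAS) that looks for long silences in /var/log/messages and reports the last line logged before each one. It assumes the usual "Mon DD HH:MM:SS" syslog prefix, an English locale for month names, and a year you pass in yourself, since syslog lines do not carry one. The function names and the 30-minute threshold are only illustrative.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2017):
    """Return the timestamp at the start of a syslog line, or None.

    Assumes the classic "Mon DD HH:MM:SS" prefix (15 characters).
    The year is not in the log, so the caller has to supply it.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        # Continuation lines, blank lines, or anything without a timestamp.
        return None


def find_gaps(lines, threshold=timedelta(minutes=30), year=2017):
    """Return (last_line_before_gap, first_line_after_gap) pairs.

    A gap is any jump between consecutive timestamped lines that is
    larger than `threshold`. Logs spanning a year boundary are not handled.
    """
    gaps = []
    prev = None  # (timestamp, line) of the previous timestamped line
    for line in lines:
        ts = parse_ts(line, year)
        if ts is None:
            continue
        if prev is not None and ts - prev[0] > threshold:
            gaps.append((prev[1], line))
        prev = (ts, line)
    return gaps


# Example use on the box itself (path taken from the post above):
#   with open("/var/log/messages") as f:
#       for before, after in find_gaps(f):
#           print("last line before gap:", before.rstrip())
#           print("first line after gap:", after.rstrip())
```

The last line before a gap only gives a lower bound on when the machine hung. A quiet system can go a long time without logging anything, so a gap by itself does not prove a crash.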

dlavigne · Oct 26, 2017

Anything in /var/log/messages prior to the crash? Also, how much RAM on the system?

ykhodo · Oct 26, 2017

dlavigne said:
Anything in /var/log/messages prior to the crash? Also, how much RAM on the system?

Nothing noteworthy in the logs. It's 32 GB of Corsair ECC RAM.

I think my problem is related to https://forums.freenas.org/index.php?threads/freenas-11-u4-crashes-regularly.58382/

dlavigne · Oct 26, 2017

If it is, it should be resolved by BETA1 (or 11.1 if you prefer to wait for RELEASE).

ykhodo · Oct 30, 2017

Disabling only C6, or only the global C-state setting, did not fix my problem. With both disabled in the BIOS, I have been up for 3 days with no crashes.

wackymole · Oct 30, 2017

It now appears to be random when it crashes. I was getting 5 days of uptime, and now it crashed within 2 days. I doubt it has anything to do with SMT, because it usually happens in the middle of the night. I did not test for long-term stability with my 11.0 build, but I don't remember anything like this happening. I will see what happens in 11.1, but I might have to change my FreeNAS Ryzen recommendation to "not stable". Keep up the great job on the bugs/roadmap!~Eventually

wblock · Oct 30, 2017

Does that platform actually enable ECC, or just allow the use of ECC RAM while ignoring the important error-correcting abilities?

Nightly reports cause a fair amount of disk activity. How old is that power supply, and of what quality?
