Kernel panic when upgrading TrueNAS 12 to 13

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
I recently tried to upgrade from TrueNAS 12.0-U8.1 to TrueNAS 13.0-U3.1, but when it tries to boot TrueNAS 13 it kernel panics with the message below:

Screenshot 2022-12-25 144507.png
From what I can tell it seems like it's running some hardware tests and there is something about core 6 it doesn't like?

Going back to TrueNAS 12 with the old boot environment it works fine but as soon as I try to boot 13 it fails on this. Has there been some additional hardware checks introduced in 13? CPU is an Intel 6700K.

Running memtest right now, will get back with the results.
 
Joined
Oct 22, 2019
Messages
3,641
Quite possibly.

How are your CPU temps in general?

What about using a live USB/ISO to run mprime overnight to stress-test the CPU?
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
Used to have issues with high temps but just got a new cooler and got it repasted and it hasn't complained since. But maybe there is one core that gets too hot, I will try running a stress test.

You think that TrueNAS 13 is checking CPU temps at boot up and exits if too hot?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
difficult to tell when you don't list your hardware as the forum rules require.
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
Motherboard: Asus MAXIMUS VIII HERO
CPU: Intel 6700K
RAM: 32GB
Hard drives:
  • 6x WDC WD80EFAX-68L in raidz2
  • 2x Kingston SUV500120G boot array
Network: Intel I350-T2
Disk controller:
Code:
nas01# sas2ircu 0 display
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS2308_2
  BIOS version                            : 7.39.02.00
  Firmware version                        : 20.00.07.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 0
  Maximum physical devices                : 1023
  Concurrent commands supported           : 10240
  Slot                                    : 16
  Segment                                 : 0
  Bus                                     : 1
  Device                                  : 0
  Function                                : 0
  RAID Support                            : No
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
ok, so not supported hardware for TrueNAS, but I don't see anything glaringly incompatible in your base config that would explain it not at least booting on 13. it's an intel NIC at least.

I do not believe it does the kinds of hardware tests you are thinking of. there is something else wrong, but this kind of error is not very clear about what is wrong. more like a car "check engine" light.

looking up your boot drives, they seem to be encrypted? that sounds suspiciously like "abstraction of the disk", but not something I'm sure of.
my first instinct, looking at your kernel panic, is to check how the encrypted drives are handled and work, and look into replacing them.
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
Boot drives shouldn't be encrypted, where did you find that information?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
I was thinking more of a link to the source you used. But are you thinking of the fact that the drives support SED? It's not something I use, and I don't see how it would be a problem. A lot of drives support SED.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
i used...google? and 2 of the first 3 results, including Kingston's own website, say that the SUV500s should be encrypted drives

I'm not sure how encrypted drives interact with what zfs expects a drive to do. zfs typically REALLY doesn't like anything that abstracts the hardware, and encrypting everything seems like it would abstract the hardware?
if something goes wrong, you're not gonna be able to boot the system, and since you can't boot the system, that is the first thing I would check.
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
From what I can see they are referring to SED. I see your point, I just wish you could have made it without such a snarky tone. I will do some tests, but I think the problem lies elsewhere.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You are getting an uncorrectable error in the L0 cache.
You can try:
Deactivating XMP RAM settings and any manual overclock you possibly made.
Reseating the RAM.
Reseating the CPU.
Reseating the PSU cables.
Clearing the CMOS.

It's not an issue directly linked with TN or any software since MCA events are definitely from hardware.
Are you hitting this constantly?

Let memtest run for at least a few days.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
You are getting an uncorrectable error in the L0 cache.
ahh. is that the line that says this then? and probably the one above it?
"ICACHE L0 IRD error"
interesting.
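
For anyone else trying to read a line like that: below is a rough Python sketch of how an x86 machine-check status word breaks down into a string such as "ICACHE L0 IRD error". The bit positions and the cache-hierarchy code layout (000F 0001 RRRR TTLL) are taken from the Intel SDM's machine-check chapter as I understand it. The function names are made up for illustration, and the sample status value is hypothetical; it was not read from the screenshot in this thread.

```python
# Sketch of decoding an x86 machine-check status word (IA32_MCi_STATUS).
# Bit positions and the cache-hierarchy code layout follow the Intel SDM.
# The sample value at the bottom is hypothetical, not taken from the screenshot.

TRANSACTION = {0: "I", 1: "D", 2: "G"}        # instruction, data, generic
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}  # LG = generic level
REQUEST = {
    0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
    5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP",
}


def decode_cache_error(code):
    """Decode a 16-bit MCA error code of the form 000F 0001 RRRR TTLL."""
    if (code & 0xEF00) != 0x0100:
        return None  # not a cache-hierarchy error
    request = REQUEST.get((code >> 4) & 0xF, "?")
    transaction = TRANSACTION.get((code >> 2) & 0x3, "?")
    level = LEVEL[code & 0x3]
    return f"{transaction}CACHE {level} {request} error"


def decode_status(status):
    """Pull the main flag bits and the error code out of a 64-bit status word."""
    code = status & 0xFFFF
    return {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        "mca_error_code": code,
        "description": decode_cache_error(code),
    }


if __name__ == "__main__":
    # Hypothetical status: valid + uncorrected + context corrupt, code 0x0150.
    sample = (1 << 63) | (1 << 61) | (1 << 57) | 0x0150
    for key, value in decode_status(sample).items():
        print(f"{key}: {value}")
```

Read this way, "I" is the instruction cache, "L0" is the cache level, and "IRD" is an instruction fetch, which is consistent with the error being reported by the CPU core itself and not by the drives or the OS.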
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
I am unable to identify where the snarky tone was.
Instead of just giving me an example link so I know we are talking about the same thing you come back with another question and then this:
i used...google?
Yes, I use Google too; that is why I have so few threads here, since I solve my issues using Google. In hindsight I could have made it clear that I was confused about whether you were referring to SED and why that would be an issue, because I could not connect an error complaining about the CPU to SED.

Anyway, I have now updated another system with an identical TrueNAS version and an identical set of drives, and it worked flawlessly. I've also run Prime95: the blend test seemed to lock up after around 12 hours, while the first (1) test ran fine for 24 hours with temps maxing out at around 65 °C.

It's not an issue directly linked with TN or any software since MCA events are definitely from hardware.
Are you hitting this constantly?
So far I've only tried two times with TrueNAS 13, but it was the same issue both times. I was going to try a clean install of 13. But considering the Prime95 result I'm thinking RAM issue as well. I'm not using XMP as that caused instability in the past. Will go through and reseat everything to start with.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Unless I missed something, issue is in the CPU itself. It's strange that it bothers you only with 13.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Really, 13 has been very solid as a release till now.
If you still hit it after reseating everything and on a clean install of 13, and only with 13, I would try making a bug report.
But it might just be time to change your CPU. You can also try contacting your CPU manufacturer for help understanding that error.
 

Agent92

Explorer
Joined
Feb 11, 2019
Messages
56
The TrueNAS 13 installer exits with the same error. I tried loading optimized defaults in the UEFI, but it's the same issue.
 