TrueNAS Scale - System freeze

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Now that's exciting:
I am able to access the controller BIOS from within the mainboard BIOS.
The picture below shows the current firmware version. If P16 == version 16, an update is probably a good idea :)

It does? Where? I'm not sure what generated that summary -- I think you mean that the BIOS did. The output from "mprutil show adapter" at the UNIX shell on an LSI 3008 looks like this:

Code:
# mprutil show adapter
mpr0 Adapter:
       Board Name: LSI3008-IT
   Board Assembly:
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 0.00.00.00
Firmware Revision: 16.00.10.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 70 C


Looking at the "Firmware Revision" line here, this card has P16 firmware (good), subrevision 16.00.10.00 (slightly bad); 16.00.12.00 is the latest available. There are some release notes around that clarify that this probably isn't a big deal in this specific case, though.
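As a rough sketch of that check, assuming the output format shown above: the function names `fw_revision` and `fw_at_least` are made up for illustration, and `LATEST` is simply the version quoted in this thread, which may not stay the latest.

```shell
#!/bin/sh
# Sketch only: parse "mprutil show adapter"-style text and compare versions.
# LATEST is the version quoted in this thread, not an authoritative value.
LATEST="16.00.12.00"

# Print the value of the "Firmware Revision:" line read from stdin.
fw_revision() {
    awk -F': *' '/Firmware Revision:/ { print $2; exit }'
}

# Succeed (exit status 0) if version $1 is at least version $2.
# Compares the dot-separated fields numerically, left to right.
fw_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Usage on a system that has mprutil:
#   rev=$(mprutil show adapter | fw_revision)
#   fw_at_least "$rev" "$LATEST" || echo "firmware $rev is older than $LATEST"
```

Comparing field by field avoids the trap of comparing version strings as plain text.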

-> Do I update only the firmware?

You can, if you choose.

-> If I need to flash the BIOS, is it mptsas3.rom (sasbios_rel) or mpt3x64.rom (uefibsd_rel\Signed)?

Depends on whether your system is booting EFI or legacy BIOS. You can also omit the BIOS. I personally do not like that because I strongly prefer the ability to debug any disk issues outside the UNIX environment.

Does somebody know what the correct procedure looks like?

It's "mprutil flash update firmware <filename>" and then "mprutil flash update bios <filename>", a two-step process, if memory serves.
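A dry-run sketch of that two-step sequence, assuming the subcommand syntax quoted above is right (check mprutil(8) on your system first). The function name `flash_plan` and the file names are placeholders, and the function only prints the commands. The `flash save` backup step is an extra precaution not mentioned above and is assumed to be available in your version of mprutil.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# $1 = firmware image (placeholder name), $2 = optional boot ROM image.
flash_plan() {
    fw_image="$1"
    bios_image="$2"   # leave empty to skip the BIOS step

    # Extra precaution (assumed available, see mprutil(8)): save what is
    # currently on the card before overwriting it.
    echo "mprutil flash save firmware firmware-backup.bin"

    # Step 1: firmware.
    echo "mprutil flash update firmware $fw_image"

    # Step 2: boot ROM, only if one was given.
    if [ -n "$bios_image" ]; then
        echo "mprutil flash update bios $bios_image"
    fi
}

# Usage (file names are examples only):
#   flash_plan firmware.bin mptsas3.rom
```

Printing the plan first makes it easy to double-check the file names before anything touches the card.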
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
Where? -> In the mainboard BIOS there is a section for the controller:
1668774610208.png
 

Attachments

  • 1668774096190.png

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
Btw: super fast response! Thanks for that.
Also interesting to see: I connected the mini-SAS cables to the new HBA and the entire system came up again.
The test pool I created is there, TrueNAS booted off one of the SSDs I had in a RAID1 before, and I was able to just add the second SSD as a mirror to the boot pool. I never expected that to be possible.
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
I don't have mprutil on my system (can I install it?), but I got the controller firmware specs via storcli:
1668774803106.png
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Btw: super fast response! Thanks for that.

Thank the insomnia. It says yer welcome. ;-)

Also interesting to see: I connected the mini-SAS cables to the new HBA and the entire system came up again.

That is an intended outcome. Controllers should be agnostic and swappable; this is super important if your data is critical. Or probably even if it isn't.

The test pool I created is there, TrueNAS booted off one of the SSDs I had in a RAID1 before, and I was able to just add the second SSD as a mirror to the boot pool. I never expected that to be possible.

In years past, certain RAID cards (3Ware, etc.) would force you to create virtual disks, which meant the controller created a partition on the disk to hold the virtual disk's contents. When you installed an OS, the installer would then write its own partition table INSIDE this other partition. It meant that there was no easy way to pull out a drive and put it into another machine without the same kind of RAID controller. This is a horrible form of vendor lock-in disguised as technical necessity. Fortunately, this practice seems to have faded into obscurity in recent years, and it leaves you with the ability to take drives out of a RAID1 set and have a good chance of having them work as-is. If you're having trouble wrapping your head around that, no worries. :smile: It just outlines one of the many reasons we're strongly in favor of LSI's IT-mode HBAs, which handle this stuff correctly.
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
sde and sdh are not even in a pool. sdv is part of my pool. But... is a single drive with unreadable sectors enough to freeze the system?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
It's possible you have a "hotspot" cooling issue around the PCIe slot area, especially if the fans are ramping their speed based on CPU temperatures rather than a sensor closer to the problem spot. An overheating HBA or component would certainly manifest itself as a fully frozen, non-responsive system [...]
I would make sure this is not happening, before trying to reinstall the OS on the boot drive.
But I am no expert on HBAs.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, in theory nothing should cause the system to freeze. But I did notice that this is on Linux, and it looks to me like the HBA may not be in IT mode, which creates a different state of affairs. I'm hoping someone who has a bit more familiarity with the Linux side of things can comment on this. It could simply be that the use of MFI/MRSAS firmware is as problematic on Linux as it is on FreeBSD. Hm.
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
It shows it's in IT mode.
My next step is to upgrade the firmware to P16. Creating a FreeDOS stick as we speak...
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Have you tried the old trick of scrubbing the controller connectors with alcohol, and baking it at 120F for 10 minutes to reflow the solder? This sounds like a physical issue with your controller or slot.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Then, send it back for a warranty replacement.
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
This also happened with the previous controller, which was a different brand.
I can replace the board and CPU; I've got a spare.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Have you tried the old trick of scrubbing the controller connectors with alcohol, and baking it at 120F for 10 minutes to reflow the solder? This sounds like a physical issue with your controller or slot.

Considering that this thread started out with a completely different controller, I'd say it's likely that there's a problem with the system itself, not the HBA.
 

friendlyguy

Dabbler
Joined
Nov 10, 2022
Messages
31
I am in the TrueNAS IRC channel on liberanet atm. Somebody recommended using fio to write directly to a dataset, as this would rule out SMB. Maybe it's that...
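A sketch of what such a test might look like, assuming fio is installed. The function name `fio_seqwrite` and the path `/mnt/testpool/fio-test` are hypothetical (substitute a dataset on the pool in question), and the block size and file size are arbitrary choices. By default the function only prints the command; set `RUN=1` to execute it.

```shell
#!/bin/sh
# Sketch: sequential write test straight to a dataset, bypassing SMB.
# $1 = directory on the dataset under test (hypothetical path in the usage).
fio_seqwrite() {
    dir="$1"
    # Assemble the fio command line as the function's argument list.
    set -- fio --name=seqwrite --directory="$dir" --rw=write \
        --bs=1M --size=4G --ioengine=psync --end_fsync=1
    if [ "${RUN:-0}" = "1" ]; then
        "$@"          # actually run fio
    else
        echo "$*"     # dry run: just show the command
    fi
}

# Usage:
#   fio_seqwrite /mnt/testpool/fio-test          # print the command
#   RUN=1 fio_seqwrite /mnt/testpool/fio-test    # run it
```

If the system also freezes during a local write like this, SMB is not the cause.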
 