We run TrueNAS SCALE (23.10.0.1) on a server with an Adaptec Series 7 ASR-71605 set to HBA mode. All disks are healthy and everything otherwise works well, but the system hangs a few times a day with the following errors in our syslog:
Code:
....
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,31,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,43,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,34,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,22,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,24,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,24,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,50,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,50,0):
Feb 6 13:01:09 betty kernel: aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,24,0):
Feb 6 13:01:09 betty kernel: aacraid: Host bus reset request. SCSI hang ?
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: outstanding cmd: midlevel-0
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: outstanding cmd: lowlevel-0
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: outstanding cmd: error handler-0
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: outstanding cmd: firmware-32
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: outstanding cmd: kernel-0
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: Controller reset type is 3
Feb 6 13:01:09 betty kernel: aacraid 0000:01:00.0: Issuing IOP reset
Feb 6 13:01:37 betty kernel: aacraid 0000:01:00.0: IOP reset succeeded
This causes everything to hang for anywhere from a few seconds to a few minutes, multiple times a day.
I found a lengthy discussion of the same issue here: https://forums.unraid.net/topic/149...ausing-this-it-keeps-popping-up-in-my-syslog/
and an official kernel bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=217599
My concern is that TrueNAS manages and packages its own kernel, so I assume that whatever fix gets pushed into the mainline Linux kernel could take years to reach TrueNAS. Does anyone have suggestions for how we can cleanly get this working on our system?
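For anyone who wants to put numbers on these hangs before trying a workaround, here is a minimal Python sketch that pairs each "Issuing IOP reset" line with the following "IOP reset succeeded" line and reports how long the controller was stalled. The function names (`parse_ts`, `find_resets`) and the `/var/log/syslog` default path are my own assumptions for illustration, and it assumes the classic syslog timestamp format shown in the log above.

```python
import re
import sys
from datetime import datetime

# Matches classic syslog lines such as "Feb 6 13:01:09 betty kernel: aacraid ...".
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+(.*)$"
)


def parse_ts(month, day, clock, year=2000):
    # Syslog timestamps carry no year; a placeholder year is enough
    # for measuring short durations (assumption: no year rollover mid-reset).
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_resets(lines):
    """Return (start, end, seconds) for each IOP reset found in the log lines."""
    resets = []
    started = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # continuation lines without a timestamp are skipped
        month, day, clock, msg = m.groups()
        if "Issuing IOP reset" in msg:
            started = parse_ts(month, day, clock)
        elif "IOP reset succeeded" in msg and started is not None:
            ended = parse_ts(month, day, clock)
            resets.append((started, ended, (ended - started).total_seconds()))
            started = None
    return resets


if __name__ == "__main__":
    # Hypothetical default path; pass the real log file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    with open(path, errors="replace") as fh:
        for start, end, secs in find_resets(fh):
            print(f"{start:%b %d %H:%M:%S} -> {end:%H:%M:%S}  ({secs:.0f}s)")
```

Run against the excerpt above, this would report a single reset lasting from 13:01:09 to 13:01:37. It only measures the controller reset itself, so the hang seen by users may be longer.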