Tunable dev.mrsas.0.mrsas_io_timeout in loader.conf ignored

Kramax

Cadet
Joined
Feb 12, 2020
Messages
4
I am trying to address a 180-second boot stall that I believe is caused by a SCSI command timeout involving the SAS drive cage. It does not cause any functional problem, but it makes reboots much slower. I do not think I can fix the underlying SCSI timeout, but the mrsas driver in use appears to have an adjustable I/O timeout whose default is 180 seconds. The mrsas man page says:
Code:
     To change the I/O timeout value for a specific mrsas driver instance, set
     the following tunable value in loader.conf(5):

           dev.mrsas.X.mrsas_io_timeout=NNNNNN

     where NNNNNN is the timeout value in milli-seconds.

I set this tunable using the FreeNAS webgui with a type of "Loader":
Code:
dev.mrsas.0.mrsas_io_timeout="30000"

I verified that it did put it into /boot/loader.conf.local:
Code:
dev.mrsas.0.mrsas_io_timeout="30000" # H730 scsi timeout to 30 sec (msec raw value) as it always times out on boot, default is 180000
kernel="kernel"
module_path="/boot/kernel;/boot/modules;/usr/local/modules"
kern.cam.ctl.ha_id=0
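
To rule out a syntax problem with that line (the quotes, the trailing comment), here is a minimal Python sketch that reads loader.conf-style text and pulls out the name/value pairs. It is not the loader's real parser; `parse_loader_conf` and its rules (one `name=value` per line, `#` starts a comment, quotes stripped) are my own simplifying assumptions.

```python
import shlex

# Sample text copied from the /boot/loader.conf.local shown above.
SAMPLE = '''\
dev.mrsas.0.mrsas_io_timeout="30000" # H730 scsi timeout to 30 sec (msec raw value) as it always times out on boot, default is 180000
kernel="kernel"
module_path="/boot/kernel;/boot/modules;/usr/local/modules"
kern.cam.ctl.ha_id=0
'''


def parse_loader_conf(text):
    """Return {name: value} for simple name=value lines (simplified rules)."""
    settings = {}
    for raw in text.splitlines():
        # shlex drops a trailing "# ..." comment and strips the quotes.
        tokens = shlex.split(raw, comments=True)
        if len(tokens) != 1 or "=" not in tokens[0]:
            continue  # blank line, comment-only line, or something unexpected
        name, _, value = tokens[0].partition("=")
        settings[name] = value
    return settings


if __name__ == "__main__":
    for name, value in parse_loader_conf(SAMPLE).items():
        print(f"{name} = {value}")
```

If the tunable shows up here with the value 30000, the line is at least well-formed by these simplified rules. That says nothing about whether the driver actually reads it at boot.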

I rebooted, and the stall during boot did not change. Furthermore, when I ran "sysctl dev.mrsas" on the command line, it showed that the value had not changed:
Code:
freenas% sysctl dev.mrsas   
dev.mrsas.0.SGE holes: 0
dev.mrsas.0.prp_count: 0
dev.mrsas.0.stream detection: 1
dev.mrsas.0.block_sync_cache: 0
dev.mrsas.0.reset_in_progress: 0
dev.mrsas.0.mrsas_fw_fault_check_delay: 1
dev.mrsas.0.mrsas_io_timeout: 180000
dev.mrsas.0.mrsas_debug: 31
dev.mrsas.0.io_cmds_highwater: 265
dev.mrsas.0.fw_outstanding: 0
dev.mrsas.0.reset_count: 1
dev.mrsas.0.driver_version: 07.709.04.00-fbsd
dev.mrsas.0.disable_ocr: 0
dev.mrsas.0.%domain: 0
dev.mrsas.0.%parent: pci3
dev.mrsas.0.%pnpinfo: vendor=0x1000 device=0x005d subvendor=0x1028 subdevice=0x1f49 class=0x010400
dev.mrsas.0.%location: slot=0 function=0 dbsf=pci0:2:0:0
dev.mrsas.0.%driver: mrsas
dev.mrsas.0.%desc: AVAGO Invader SAS Controller
dev.mrsas.%parent: 
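
For anyone who wants to repeat this check after each reboot, a small Python sketch along these lines could compare the live value against the one that was requested. The helper names (`parse_sysctl`, `timeout_report`) are hypothetical, and it assumes the output has already been captured as text in the `name: value` form shown above.

```python
# A few lines copied from the "sysctl dev.mrsas" output above.
SAMPLE = '''\
freenas% sysctl dev.mrsas
dev.mrsas.0.SGE holes: 0
dev.mrsas.0.mrsas_io_timeout: 180000
dev.mrsas.0.%location: slot=0 function=0 dbsf=pci0:2:0:0
dev.mrsas.%parent: 
'''


def parse_sysctl(output):
    """Map each sysctl name to its value string, splitting on the first colon."""
    values = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name: value" line (e.g. the shell prompt)
        values[name.strip()] = value.strip()
    return values


def timeout_report(output, unit=0, wanted_ms=30000):
    """Return (live value in ms, live value in seconds, matches the wanted value?)."""
    key = f"dev.mrsas.{unit}.mrsas_io_timeout"
    live_ms = int(parse_sysctl(output)[key])
    return live_ms, live_ms / 1000, live_ms == wanted_ms


if __name__ == "__main__":
    live_ms, live_s, ok = timeout_report(SAMPLE)
    print(f"live timeout: {live_ms} ms ({live_s:g} s), matches request: {ok}")
```

The millisecond-to-second conversion is included only because the tunable is in milliseconds while the boot messages count seconds.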

I tried rebooting again, with no change. I then changed the type to "sysctl" and rebooted; the "sysctl dev.mrsas" command did show the new value (30000), but it had no effect on the boot stall. And the man page clearly says the tunable should be put into loader.conf.

The problematic logs during boot look like this:
Code:
freenas uhub4: <no manufacturer Gadget USB HUB, class 9/0, rev 2.00/0.00, addr 5> on usbus0
freenas uhub4: 6 ports with 6 removable, self powered
freenas run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
freenas random: unblocking device.
freenas run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
freenas mrsas0: Initiating Target RESET because of SCSI IO timeout!
freenas run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
freenas mrsas0: Task management NOT SUPPORTED for CAM target:0
freenas mrsas0: target reset FAIL!!
freenas mrsas0: Initiaiting OCR because of TM FAILURE!
freenas mrsas0: [ 0]waiting for 10 commands to complete
freenas mrsas0: Reset Exit with 0.
freenas ses0 at mrsas0 bus 1 scbus1 target 32 lun 0
freenas ses0: <DP BP13G+EXP 3.35> Fixed Enclosure Services SPC-4 SCSI device
freenas ses0: 150.000MB/s transfers
freenas ses0: SES Device
freenas da9 at umass-sim0 bus 0 scbus3 target 0 lun 0

Although it is possible that the "still waiting after xx seconds for xpt_config" messages are caused by something else (I do NOT have a 1394 controller), the fact that the SCSI timeout is 180 seconds by default and the "Target RESET" message appears right alongside the last xpt_config messages makes it seem likely that this SCSI timeout is the cause.
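
To make that ordering easier to check on other boots, here is a rough Python sketch that scans a captured boot log for the xpt_config wait messages and notes where the first mrsas timeout reset appears relative to them. The function name `summarize_boot_stall` is hypothetical, and the matching relies only on the message text shown above, since these log lines carry no timestamps.

```python
import re

# Lines copied from the boot log excerpt above.
SAMPLE = '''\
freenas run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
freenas random: unblocking device.
freenas run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
freenas mrsas0: Initiating Target RESET because of SCSI IO timeout!
freenas run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
freenas mrsas0: Task management NOT SUPPORTED for CAM target:0
'''

WAIT_RE = re.compile(r"still waiting after (\d+) seconds for (\S+)")


def summarize_boot_stall(log_text):
    """Return (longest xpt_config wait seen, last wait mark logged before the first reset).

    The second element is None if no "SCSI IO timeout" line is found.
    """
    longest = 0
    mark_before_reset = None
    for line in log_text.splitlines():
        match = WAIT_RE.search(line)
        if match and match.group(2) == "xpt_config":
            longest = max(longest, int(match.group(1)))
        elif "SCSI IO timeout" in line and mark_before_reset is None:
            mark_before_reset = longest
    return longest, mark_before_reset


if __name__ == "__main__":
    longest, mark = summarize_boot_stall(SAMPLE)
    print(f"longest xpt_config wait reported: {longest} s")
    print(f"last wait mark before the mrsas reset: {mark} s")
```

This only shows the order in which the messages were logged. It cannot prove that the mrsas timeout is what xpt_config is waiting on.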

Has anyone using mrsas driver had any luck in changing this value at boot time?

-Thanks
 

Kramax

Cadet
Joined
Feb 12, 2020
Messages
4
Fortunately, I did have a PERC H330 in a machine that was being returned; I swapped the H730 for the H330 and performed the dreaded "cross-flash to HBA330" procedure -- it worked! I now have a "PERC HBA330". I hooked everything up and booted -- no boot stall! It is disappointing that the so-called "HBA mode" on the H730 does not seem to perform as its name advertises. I tested normal performance and things look good. I also ran nasty tests, removing drives, clearing their labels, and re-inserting them, and everything passed with flying colors -- all the arrays were actually fault tolerant!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yeah, one of the reasons we're insistent on the LSI HBA with IT firmware is because it correctly handles all those "nasty tests." It's good that you're checking that anyways.
 