LSI 9207-8i Firmware Question

Status
Not open for further replies.

tomspost

Cadet
Joined
Apr 3, 2013
Messages
7
Hello.

Quick question about the above LSI HBA. We acquired one and it has the newest firmware installed (P15).
The FreeNAS documentation states that there will be problems when using firmware older than P13 with FreeNAS 8.3.1, because the FreeBSD driver is version 13.

What if the firmware on the card is newer than the driver? Should I expect problems and therefore downgrade the firmware?
The card is working correctly at the moment.

Thanks,
Thomas
 

tomspost

Cadet
Joined
Apr 3, 2013
Messages
7
Spoke too soon when I said it was running fine. :confused:

Got these errors this morning:

Jun 5 09:55:46 stor-2 kernel: mps0: mpssas_alloc_tm freezing simq
Jun 5 09:55:46 stor-2 kernel: mps0: timedout cm 0xffffff80012417e8 allocated tm 0xffffff80012393f0
Jun 5 09:55:46 stor-2 kernel: (pass19:mps0:0:25:0): REQUEST SENSE. CDB: 3 0 0 0 12 0 length 18 SMID 181 completed timedout cm 0xffffff80012417e8 ccb 0xffff
Jun 5 09:55:46 stor-2 kernel: ff002712d800 during recovery ioc 8048 scsi 0 state c xfer 0
Jun 5 09:55:46 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 78 abort TaskMID 181 status 0x0 code 0x0 count 1
Jun 5 09:55:46 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 78 finished recovery after aborting TaskMID 181
Jun 5 09:55:46 stor-2 kernel: mps0: mpssas_free_tm releasing simq
Jun 5 09:56:06 stor-2 kernel: mps0: mpssas_scsiio_timeout checking sc 0xffffff8001220000 cm 0xffffff8001269b18
Jun 5 09:56:06 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 4d 0 0 0 0 0 4 0 length 4 SMID 683 command timeout cm 0xffffff8001269b18 ccb 0xffffff002712d800
Jun 5 09:56:06 stor-2 kernel: mps0: mpssas_alloc_tm freezing simq
Jun 5 09:56:06 stor-2 kernel: mps0: timedout cm 0xffffff8001269b18 allocated tm 0xffffff8001239538
Jun 5 09:56:06 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 4d 0 0 0 0 0 4 0 length 4 SMID 683 completed timedout cm 0xffffff8001269b18 ccb 0xffffff002712d800 during recovery ioc 8048 scsi 0 state c xfer 0
Jun 5 09:56:06 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 79 abort TaskMID 683 status 0x0 code 0x0 count 1
Jun 5 09:56:06 stor-2 kernel:
Jun 5 09:56:06 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 79 finished recovery after aborting TaskMID 683
Jun 5 09:56:06 stor-2 kernel: mps0: mpssas_free_tm releasing simq
Jun 5 09:56:26 stor-2 kernel: mps0: mpssas_scsiio_timeout checking sc 0xffffff8001220000 cm 0xffffff800124ab40
Jun 5 09:56:26 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 50 0 0 0 0 0 4 0 length 4 SMID 296 command timeout cm 0xffffff800124ab40 ccb 0xffffff002712d800
Jun 5 09:56:26 stor-2 kernel: mps0: mpssas_alloc_tm freezing simq
Jun 5 09:56:26 stor-2 kernel: mps0: timedout cm 0xffffff800124ab40 allocated tm 0xffffff8001239680
Jun 5 09:56:26 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 50 0 0 0 0 0 4 0 length 4 SMID 296 completed timedout cm 0xffffff800124ab40 ccb 0xffffff002712d800 during recovery ioc 8048 scsi 0 state c xfer 0
Jun 5 09:56:26 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 80 abort TaskMID 296 status 0x0 code 0x0 count 1
Jun 5 09:56:26 stor-2 kernel: (noperiph:mps0:0:25:0): SMID 80 finished recovery after aborting TaskMID 296
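
To see at a glance which device the timeouts belong to, the log lines above could be tallied with a short script. This is only a rough sketch: the function name `count_timeouts` is made up for illustration, and it assumes the parenthesised tuple follows the usual CAM layout of (peripheral:controller:bus:target:lun).

```python
import re
from collections import Counter

# Matches the device tuple in lines such as
# "(pass19:mps0:0:25:0): LOG SENSE. ... command timeout cm 0x..."
# Assumed layout: (peripheral:controller:bus:target:lun).
DEVICE_RE = re.compile(r"\((\w+):(mps\d+):(\d+):(\d+):(\d+)\):")


def count_timeouts(lines):
    """Count 'command timeout' events per (controller, target)."""
    counts = Counter()
    for line in lines:
        if "command timeout" not in line:
            continue
        match = DEVICE_RE.search(line)
        if match:
            controller = match.group(2)
            target = int(match.group(4))
            counts[(controller, target)] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    "Jun 5 09:56:06 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 4d 0 0 0 0 0 4 0 length 4 SMID 683 command timeout cm 0xffffff8001269b18 ccb 0xffffff002712d800",
    "Jun 5 09:56:06 stor-2 kernel: mps0: mpssas_alloc_tm freezing simq",
    "Jun 5 09:56:26 stor-2 kernel: (pass19:mps0:0:25:0): LOG SENSE. CDB: 4d 0 50 0 0 0 0 0 4 0 length 4 SMID 296 command timeout cm 0xffffff800124ab40 ccb 0xffffff002712d800",
]

for (controller, target), n in count_timeouts(sample).items():
    print(f"{controller} target {target}: {n} command timeouts")
```

In this excerpt every timeout is on target 25 of mps0, which suggests looking at that one disk, its cable, or its backplane slot before blaming the firmware.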

The server is a Supermicro SC847 with an X9DRi-F motherboard. It has 128 GB RAM and the LSI 9207-8i card mentioned above, with P15 firmware.
The zpool reported a CKSUM error on one of the disks, but a scrub run afterwards did not find any errors.

Thanks for any hints.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No idea. IBM ServeRAID M1015 flashed to IT mode:

Code:
mps0: <LSI SAS2008> port 0x5000-0x50ff mem 0xd9e04000-0xd9e07fff,0xd9e40000-0xd9e7ffff irq 19 at device 0.0 on pci11
mps0: Firmware: 15.00.00.00, Driver: 13.00.00.00-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps0: [ITHREAD]


Note firmware P15, driver P13.
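
If you want to check several boxes for the same mismatch, the two phase numbers can be pulled out of that boot message with a few lines of Python. A minimal sketch, assuming the message always has the `Firmware: ..., Driver: ...` shape shown above; the function name `firmware_driver_phase` is hypothetical.

```python
import re

# Captures the dotted version strings; the "-fbsd" suffix on the driver
# version is deliberately left out of the match.
VERSION_RE = re.compile(r"Firmware:\s*([\d.]+),\s*Driver:\s*([\d.]+)")


def firmware_driver_phase(line):
    """Return (firmware_phase, driver_phase) as ints, or None if no match."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    firmware_phase = int(match.group(1).split(".")[0])
    driver_phase = int(match.group(2).split(".")[0])
    return firmware_phase, driver_phase


line = "mps0: Firmware: 15.00.00.00, Driver: 13.00.00.00-fbsd"
firmware, driver = firmware_driver_phase(line)
if firmware != driver:
    print(f"firmware P{firmware} does not match driver P{driver}")
```
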

Drives all show up fine:

Code:
da3 at mps0 bus 0 scbus3 target 43 lun 0
da3: <ATA ST4000DM000-1F21 CC52> Fixed Direct Access SCSI-6 device
da3: 600.000MB/s transfers
da3: Command Queueing enabled
da3: 3815447MB (7814037168 512 byte sectors: 255H 63S/T 486401C)
etc


It just spent a day writing to its pool with no errors:

Code:
[root@freenas] ~# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
test    40T  28.9T  11.1T    72%  1.00x  ONLINE  /mnt
[root@freenas] ~# zpool status
  pool: test
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Tue Jun  4 04:06:38 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        test                                            ONLINE       0     0     0
          raidz3-0                                      ONLINE       0     0     0
            gptid/b7d989f0-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/b8700cbf-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/b904505a-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/b99ffb99-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/ba357257-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bacbc58d-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bb653858-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bc03636d-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bc9e6217-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bd3d14a4-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
            gptid/bddbdc4c-cd05-11e2-a2bb-000c2920acf7  ONLINE       0     0     0
        logs
          gptid/be509231-cd05-11e2-a2bb-000c2920acf7    ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#


This is undergoing basic hardware qualification right now prior to becoming a candidate for use in production. It's now running a scrub. But this hardware's been working for several months on other drives with zero problems.
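
During a burn-in it helps to check the READ/WRITE/CKSUM columns of `zpool status` automatically instead of eyeballing them. The sketch below is one way to do that, not an official tool: `devices_with_errors` is a made-up name, and the sample text is a shortened, hypothetical status listing with one checksum error added for illustration.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a nonzero counter.

    Counters are kept as strings because zpool abbreviates large
    values (for example "1.2K").
    """
    bad = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # device rows start after this header
            continue
        if line.startswith("errors:"):
            in_config = False
        if not in_config or len(fields) < 5:
            continue  # skips blank lines and section labels such as "logs"
        name, _state, read, write, cksum = fields[:5]
        if any(counter != "0" for counter in (read, write, cksum)):
            bad[name] = (read, write, cksum)
    return bad


# Hypothetical, shortened output; device names and counts are invented.
sample = """\
        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          raidz3-0  ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     1
        logs
          da3       ONLINE       0     0     0

errors: No known data errors
"""

print(devices_with_errors(sample))
```

On a live system the text could come from `subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout`, run from cron during the test period.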
 

tomspost

Cadet
Joined
Apr 3, 2013
Messages
7
Thanks for your reply.

May I ask how you stress test your hardware, as in the example above?

Another test we wanted to try was loading the original LSI driver module 'mpslsi.ko' and seeing whether this happens again.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Sit it on the bench for a few days and walk through various things. Do not excuse any unexplained missteps or odd things that happen.

For initial qualification of hardware, increase that to weeks or months. This box has run FreeNAS on bare metal, Windows, and now ESXi. There are things you can do under each one to make sure that it works and is stable ... a lot of this has to do with duplicating the sort of abuse it might see in production, but outside of a production environment. Stress every subsystem. Then stress them all together. Then put it into a configuration that resembles how you intend to use it. Repeat. Possibly several or many times. We've not deployed this E5 platform before, and because it is a massive (Supermicro X9DR7-TF+) system, it is best to shake out problems before there are dozens of VMs dependent on it.

It is easier to isolate and eradicate problems before you become reliant on a bit of technology.
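
As one small example of stressing the storage subsystem, a write-then-verify pass can be scripted. This is a rough sketch under stated assumptions rather than a full qualification procedure: `write_and_verify` and its parameters are made up for illustration, and with small sizes the read-back will probably be served from cache, so a real run would use far more data than the machine has RAM and would be repeated many times.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, size_mib=16, chunk_mib=1):
    """Write random data to a scratch file, read it back, compare digests."""
    chunk = chunk_mib * 1024 * 1024
    written = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory, prefix="burnin-")
    try:
        with os.fdopen(fd, "wb") as handle:
            for _ in range(size_mib // chunk_mib):
                data = os.urandom(chunk)
                written.update(data)
                handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out to the pool
        read_back = hashlib.sha256()
        with open(path, "rb") as handle:
            for data in iter(lambda: handle.read(chunk), b""):
                read_back.update(data)
        return written.hexdigest() == read_back.hexdigest()
    finally:
        os.remove(path)  # always clean up the scratch file


if __name__ == "__main__":
    # Point this at a dataset on the pool under test, e.g. somewhere under /mnt.
    ok = write_and_verify(tempfile.gettempdir(), size_mib=4)
    print("verify passed" if ok else "verify FAILED")
```

Run it in a loop while watching the console for driver messages, and treat any mismatch or unexplained timeout as a reason to stop and investigate.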
 