willll
Hi,
While checking my kernel logs (/var/log/syslog), I found a lot of these error messages:
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2990830026752 size=1019904 flags=40080c90
Jun 18 10:15:24 truenas kernel: mpt3sas_cm0: log_info(0x3112010a): originator(PL), code(0x12), sub_code(0x010a)
Jun 18 10:15:24 truenas kernel: mpt3sas_cm0: log_info(0x3112010a): originator(PL), code(0x12), sub_code(0x010a)
Jun 18 10:15:24 truenas kernel: mpt3sas_cm0: log_info(0x3112010a): originator(PL), code(0x12), sub_code(0x010a)
Jun 18 10:15:24 truenas kernel: mpt3sas_cm0: log_info(0x3112010a): originator(PL), code(0x12), sub_code(0x010a)
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2996811702272 size=4096 flags=180990
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2996811829248 size=4096 flags=180990
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2990831046656 size=1015808 flags=40080c90
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2996811546624 size=4096 flags=180990
Jun 18 10:15:24 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/f00b9403-bf92-11ed-a3c9-6c626d38329f error=5 type=1 offset=2990832062464 size=1019904 flags=40080c90
Jun 18 10:15:24 truenas kernel: sd 1:0:2:0: Power-on or device reset occurred
Or
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] Unaligned partial completion (resid=177148, sector_sz=512)
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] tag#1353 CDB: Read(16) 88 00 00 00 00 01 45 21 88 a8 00 00 06 58 00 00
Jun 18 09:53:49 truenas kernel: scsi_io_completion_action: 2 callbacks suppressed
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] tag#1353 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] tag#1353 Sense Key : Aborted Command [current]
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] tag#1353 Add. Sense: Information unit iuCRC error detected
Jun 18 09:53:49 truenas kernel: sd 1:0:2:0: [sdd] tag#1353 CDB: Read(16) 88 00 00 00 00 01 45 21 88 a8 00 00 06 58 00 00
Jun 18 09:53:49 truenas kernel: print_req_error: 2 callbacks suppressed
Or
Jun 18 10:14:36 truenas kernel: sd 1:0:13:0: [sdo] tag#516 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK cmd_age=0s
Jun 18 10:14:36 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/ad1aefb6-c915-11ed-9838-001e4fc1342e error=5 type=1 offset=2975952191488 size=618496 flags=40080c90
Jun 18 10:14:36 truenas kernel: sd 1:0:13:0: [sdo] tag#516 CDB: Read(16) 88 00 00 00 00 01 5e 32 4e 98 00 00 00 58 00 00
Jun 18 10:14:36 truenas kernel: sd 1:0:13:0: [sdo] tag#485 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK cmd_age=0s
Jun 18 10:14:36 truenas kernel: blk_update_request: I/O error, dev sdo, sector 5875322520 op 0x0:(READ) flags 0x700 phys_seg 11 prio class 0
Jun 18 10:14:36 truenas kernel: sd 1:0:13:0: [sdo] tag#485 CDB: Read(16) 88 00 00 00 00 01 5e 32 44 d0 00 00 05 10 00 00
Jun 18 10:14:36 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/ad1aefb6-c915-11ed-9838-001e4fc1342e error=5 type=1 offset=2975952809984 size=45056 flags=180990
Jun 18 10:14:36 truenas kernel: blk_update_request: I/O error, dev sdo, sector 5875320016 op 0x0:(READ) flags 0x700 phys_seg 113 prio class 0
Jun 18 10:14:36 truenas kernel: zio pool=main vdev=/dev/disk/by-partuuid/ad1aefb6-c915-11ed-9838-001e4fc1342e error=5 type=1 offset=2975951527936 size=663552 flags=40080c90
Jun 18 10:14:37 truenas kernel: sd 1:0:13:0: Power-on or device reset occurred
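To make the Read(16) CDB lines above easier to cross-check against the `blk_update_request` sectors, here is a minimal Python sketch that decodes one. It assumes the standard READ(16) layout (opcode 0x88, 8-byte big-endian LBA in bytes 2 to 9, 4-byte transfer length in bytes 10 to 13) and the 512-byte sector size reported in the log; `decode_read16` is just an illustrative helper name.

```python
def decode_read16(cdb_hex: str, sector_size: int = 512) -> dict:
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB (opcode 0x88)")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: starting logical block
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: number of blocks
    return {"lba": lba, "blocks": blocks, "bytes": blocks * sector_size}


if __name__ == "__main__":
    # CDB copied from the tag#516 line for sdo above.
    print(decode_read16("88 00 00 00 00 01 5e 32 4e 98 00 00 00 58 00 00"))
```

For the tag#516 CDB this gives LBA 5875322520, which is the sector in the first `blk_update_request` line, and 88 blocks (45056 bytes), which appears to correspond to the `zio` entry with `size=45056`.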
My understanding is that, at some point, the LSI card cannot read from an HDD and resets it. This even prevents smartctl from finishing its tests.
All the hard drives are affected, and 5 of them are less than 4 months old, so I would rule out the hard drives for now.
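As a reference for the `mpt3sas_cm0: log_info(0x3112010a)` lines, the sketch below splits the 32-bit value into the fields the driver prints. The bit layout (4-bit bus type, 4-bit originator, 8-bit code, 16-bit sub-code) and the originator names are my reading of the mpt3sas driver and should be treated as an assumption; the sketch does not try to interpret what code 0x12 / sub_code 0x010a means.

```python
# Originator names as I understand the mpt3sas driver to print them (assumption).
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(value: int) -> dict:
    """Split an mpt3sas log_info word into its bit fields."""
    return {
        "bus_type": (value >> 28) & 0xF,   # top 4 bits (3 is assumed to mean SAS)
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,      # next 8 bits
        "sub_code": value & 0xFFFF,        # low 16 bits
    }


if __name__ == "__main__":
    fields = decode_log_info(0x3112010A)
    print(fields["originator"], hex(fields["code"]), hex(fields["sub_code"]))
```

The decoded fields agree with what the kernel line already shows: originator PL, code 0x12, sub_code 0x010a.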
>uname -a
Linux truenas 5.15.79+truenas #1 SMP Mon Apr 10 14:00:27 UTC 2023 x86_64 GNU/Linux
I have already spent some time looking into my card's firmware:
>sas3flash -c 0 -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.
Adapter Selected is a Avago SAS: SAS3008(C0)
Controller Number : 0
Controller : SAS3008(C0)
PCI Address : 00:44:00:00
SAS Address : 5003048-0-18d8-d402
NVDATA Version (Default) : 0e.01.00.08
NVDATA Version (Persistent) : 0e.01.30.28
Firmware Product ID : 0x2221 (IT)
Firmware Version : 16.00.12.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9300-8e
BIOS Version : 08.37.00.00
UEFI BSD Version : 18.00.00.00
FCODE Version : N/A
Board Name : LSI3008-IT
Board Assembly : N/A
Board Tracer Number : N/A
Finished Processing Commands Successfully.
Exiting SAS3Flash.
I also tried swapping the computer for a PowerEdge R720xd.
I have the following setup:
* Supermicro S3008L-L8E
* SFF-8644 to SFF-8643 card adapter
* LRSACX36-24I SAS expander card
* 20 various Western Digital SATA HDDs
The HDDs are powered through their own PSUs, with at most 3 HDDs per SATA power line.
I am not sure where to look now.