I migrated from Core to Scale on May 30 so that I could use the Plex plugin to enable hardware transcoding on my CPU; Plex recently dropped support for this on FreeBSD. After I migrated, I found that I was getting data corruption from my old SATA card, so I purchased a new LSI SAS 9207-8i controller card (LSI00301) from Amazon. I switched my Exos 16 TiB drives to this new controller and the number of errors dropped at first, but then they came back. They were not associated with any particular drive, but just to make sure, I installed my spare and the errors showed up on it too. When I looked into the log, I found the kernel errors shown further down.
These happen randomly: I can run for up to 24 hours with no errors, but then they show up in clusters. I upgraded the firmware on my Exos drives to SN04, but the errors continued. I have 2 pools, Riverdale and Centerville. The Riverdale pool of 5 drives was attached to the motherboard, and the Centerville pool of 6 drives in RAIDZ2 was attached to the controller card. All the errors happened on the Centerville pool on the controller card. To debug, I swapped the cables so that Centerville is now attached to the motherboard SATA and Riverdale is attached to the controller. Now the errors are happening on the Riverdale pool. I think this is good evidence that the problem is the SAS controller card. As far as I can tell, my controller is updated to the latest firmware in IT mode, as shown in the sas2flash list output below. Also, I did a surface scan on 3 of the drives and found no errors. I don't believe that I am actually getting corrupted data, just failing responses from the card.

Here are the log entries:
Jun 15 17:51:36 nas kernel: zio pool=Riverdale vdev=/dev/disk/by-partuuid/6155bea5-8b54-11ec-a029-a0369f1ff640 error=5 type=1 offset=270336 size=8192 flags=b08c1
Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 Sense Key : Not Ready [current]
Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 Add. Sense: Logical unit not ready, cause not reportable
Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 CDB: Read(16) 88 00 00 00 00 04 8c 3f fa 90 00 00 00 10 00 00
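To check whether the failures really are spread across drives, something like the following could tally the FAILED entries per device and decode the Read(16) CDB. This is only a rough sketch: the function names and regular expressions are my own and assume the kernel log format shown above, and the byte offsets follow the standard Read(16) layout (bytes 2-9 hold the LBA, bytes 10-13 the transfer length).

```python
import re
from collections import Counter

# Assumed format, taken from the log lines above, e.g.
#   "... kernel: sd 0:0:4:0: [sdb] tag#9453 FAILED Result: ..."
FAILED_RE = re.compile(r"sd (\d+:\d+:\d+:\d+): \[(\w+)\] tag#\d+ FAILED")
CDB_RE = re.compile(r"CDB: Read\(16\) ((?:[0-9a-f]{2} ?){16})")


def count_failures(lines):
    """Return a Counter of FAILED commands keyed by (SCSI address, device)."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def decode_read16(line):
    """Return (lba, block_count) from a Read(16) CDB log line, or None."""
    m = CDB_RE.search(line)
    if not m:
        return None
    cdb = bytes.fromhex(m.group(1).replace(" ", ""))
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


# Two of the lines from the log above, used as sample input.
SAMPLE = [
    "Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 FAILED Result: "
    "hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s",
    "Jun 15 17:51:36 nas kernel: sd 0:0:4:0: [sdb] tag#9453 CDB: Read(16) "
    "88 00 00 00 00 04 8c 3f fa 90 00 00 00 10 00 00",
]

print(count_failures(SAMPLE))
print(decode_read16(SAMPLE[1]))
```

For the CDB above, the transfer length decodes to 16 blocks. If the drive reports 512-byte logical sectors (an assumption on my part), that is 8192 bytes, which matches size=8192 in the zio line. Feeding the full kernel log through count_failures would show whether the failures cluster on one SCSI address or are spread across the controller.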
I am getting a bit desperate. Does anybody have any suggestions on how I should proceed?
Adapter Selected is a LSI SAS: SAS2308_2(D1)
Controller Number : 0
Controller : SAS2308_2(D1)
PCI Address : 00:02:00:00
SAS Address : 500605b-0-08de-5000
NVDATA Version (Default) : 14.01.00.06
NVDATA Version (Persistent) : 14.01.00.06
Firmware Product ID : 0x2214 (IT)
Firmware Version : 20.00.07.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9207-8i
BIOS Version : 07.39.02.00
UEFI BSD Version : N/A
FCODE Version : N/A
Board Name : SAS9217-8i
Board Assembly : H3-25566-00C
Board Tracer Number : SV41941105