SCSI Status Error

Status
Not open for further replies.

waltw

Cadet
Joined
Apr 24, 2017
Messages
4
This is my first post in the forums and consider myself an advanced novice user.

I have a TS140 system with 4 WD Red 5GB drives in a RAIDZ configuration . I started getting daily email notifications with content similar to the following the day after (March 4, 2017) I upgraded ESXi from 6.0 U2 to 6.5.

Code:
freenas.workgroup kernel log messages:
> da0: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
> da1: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-3 SCSI device
> da1: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
> da2: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-3 SCSI device
> da2: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
> da3: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-3 SCSI device
> da3: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
> da4: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-3 SCSI device
> da4: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
> pid 8395 (Plex DLNA Server), uid 972: exited on signal 11
> (da2:mpt0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 37 fa 54 c0 00 00 00 58 00 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Retrying command
> (da1:mpt0:0:1:0): WRITE(16). CDB: 8a 00 00 00 00 01 e0 40 2a 60 00 00 00 10 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command
> (da1:mpt0:0:1:0): WRITE(16). CDB: 8a 00 00 00 00 01 e0 40 2a 60 00 00 00 10 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command...

After receiving the first message I receive daily messages similar to this:
Code:
freenas.workgroup kernel log messages:
> (da2:mpt0:0:2:0): READ(16). CDB: 88 00 00 00 00 02 11 3e 2a 40 00 00 00 08 00 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Retrying command
> (da2:mpt0:0:2:0): READ(16). CDB: 88 00 00 00 00 02 11 3e 2a 40 00 00 00 08 00 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Retrying command
> (da1:mpt0:0:1:0): READ(16). CDB: 88 00 00 00 00 02 11 33 78 30 00 00 00 08 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command
> (da1:mpt0:0:1:0): READ(16). CDB: 88 00 00 00 00 02 11 33 78 30 00 00 00 08 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command...

I have a USB boot drive and an SSD storing the VMs. I believe all my drivers and the BIOS are up to date with the exception of the Intel E1000 NIC driver. The system has 32 GB of RAM and I am running FreeNAS-9.10.2-U3. Also I am running at 92% capacity instead of the recommended 80% (two 8GB external drives are on order to offload some files). I ran the S.M.A.R.T. command on the four WD drives and received no errors. To the best of my knowledge the only change I made at the time was to upgrade ESXi. I have been running the system since without any apparent problems.
 
Last edited by a moderator:

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
VMware usually reports Busy status when it has difficulties communicating to the disk and has to tell guest something to avoid timeout. FreeNAS should retry those errors forever without ill effects, but it may be a symptom of some problem.
 

waltw

Cadet
Joined
Apr 24, 2017
Messages
4
I have not figured it out. I upgraded VM compatibility to 6.5 and still have the issues. I am getting errors on all four of my data drives. As I mentioned the log messages had no errors until I upgraded to ESXi 6.5; now get the log with errors daily, The "Unretryable error" just appeared in the most recent log. This is a portion of the log for one drive.

Code:
f
freenas.workgroup kernel log messages:
> (da2:mpt0:0:2:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Retrying command
> (da2:mpt0:0:2:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Error 5, Unretryable error
> (da2:mpt0:0:2:0): READ(10). CDB: 28 00 00 40 02 90 00 00 10 00
> (da2:mpt0:0:2:0): CAM status: SCSI Status Error
> (da2:mpt0:0:2:0): SCSI status: Busy
> (da2:mpt0:0:2:0): Retrying command


I have 32GB of ECC ram with 16GB allocated to the FreeNAS VM. I have reduced my data to 83% capacity. I am backing up my data now to two 8GB USB drives in case I need to reinstall the OS or reformat the drives or move to a new platform.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
How are drives shared from ESXi to FreeNAS, current settings?
 

waltw

Cadet
Joined
Apr 24, 2017
Messages
4
How do I get these settings -- from a cmd shell? I use the web console for viewing settings.

Also since my last post I received a CKSUM error (1 error) on one drive in the zpool and I just cleared the error.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
There are several ways to give FreeNAS access to the underlying HDDs.
I'm asking you configured your setup.

My spider senses are tingling - very thin ice I sense.
 

wafliron

Dabbler
Joined
Aug 25, 2016
Messages
13
Did you get any further re figuring this out waltw?

Also, FYI, when Dice asked "How are drives shared from ESXi to FreeNAS, current settings?", he meant how have you presented the hard disks to your FreeNAS VM - did you create ESXi datastores on them and then create VMDK disks on those datastores, which are connected to your FreeNAS VM? Or did you do something else - e.g. RDMs, or attaching the disks to a SAS HBA which is passed directly through to the FreeNAS VM using VMDirectPath I/O?

Anyway - I too have encountered these warnings in my FreeNAS 9.10.2-U4 VM running on ESXi. I previously saw them with 9.10.2-U3 as well - not sure about before that.

Code:
> (da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 01 e2 5a bb 00 01 00 00
> (da0:mpt0:0:0:0): CAM status: SCSI Status Error
> (da0:mpt0:0:0:0): SCSI status: Busy
> (da0:mpt0:0:0:0): Retrying command
> (da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 01 e1 c0 48 00 01 00 00
> (da0:mpt0:0:0:0): CAM status: SCSI Status Error
> (da0:mpt0:0:0:0): SCSI status: Busy
> (da0:mpt0:0:0:0): Retrying command
> (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 00 01 e1 c2 48 00 01 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command
> (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 00 01 e3 ce 76 00 01 00 00
> (da1:mpt0:0:1:0): CAM status: SCSI Status Error
> (da1:mpt0:0:1:0): SCSI status: Busy
> (da1:mpt0:0:1:0): Retrying command
> (da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 01 e3 ce 76 00 01 00 00
> (da0:mpt0:0:0:0): CAM status: SCSI Status Error
> (da0:mpt0:0:0:0): SCSI status: Busy
> (da0:mpt0:0:0:0): Retrying command


I'm confident the problem only appeared since upgrading ESXi to v6.5 from v6.0 (U2 I think) (only noticed it for the first time since upgrading), and it only affects two of my "disks" which are actually VMDKs sitting on ESXi datastores (with the underlying "disks" being two very old 32GB SSDs). Those two "disks" comprise my mirrored FreeNAS boot volume. I haven't noticed similar warnings for any of my other disks - three larger / newer SSDs and six 4TB WD Reds - which are connected to two SAS HBAs that are directly passed through to the FreeNAS VM via VMDirectPath I/O.

Interestingly, I believe I'm only ever seeing these warnings during the process of updating FreeNAS to a new version - or possibly only during the FreeNAS reboot process regardless of whether I'm installing an update (pretty much the only time I ever reboot it is during an update). It is possible I'm receiving them at other times and just not noticing - although I believe FreeNAS would be e-mailing me about the warnings if that was the case, and I'm not receiving any such e-mails. The warnings appear to be just that so far - warnings - and haven't caused any actual problems that I've noticed (I assume retrying the SCSI commands was successful).

My speculation is that ESXi 6.5 introduced some sort of incompatibility / issue with FreeNAS 9.10.2's paravirtualised SCSI driver (or similar) that only manifests itself during periods of heavy disk activity, causing these SCSI timeouts / errors. In my case, this would explain why I'm only receiving the warnings when installing FreeNAS updates (only time my boot volumes would ever receive any sort of significant write workload, I'd think), and only seeing the warnings for disks that are VMDKs sitting on ESXi datastores.

Anyone have any thoughts on the above hypothesis? Or can anyone else corroborate the exact behaviour I'm seeing?
 
Status
Not open for further replies.
Top