Another DEGRADED array, SCSI errors thread

trashlab · Nov 9, 2018

Hi all,

I am brand new to FreeNAS as of today! Have wanted to get my feet wet in the homelab world for a while and the parts have finally arrived.

Here's the hardware:
Dell Precision T7600 chassis
2x E5-2630 (6 cores, 2.3 GHz)
96GB DDR3 ECC
PERC H310 flashed to IT mode following this guide: https://techmattr.wordpress.com/201...-flashing-to-it-mode-dell-perc-h200-and-h310/
4x WD Red 3TB, using the bays in the front of the machine.

I currently have FreeNAS 11.1 U6 running in a VM under Proxmox VE. The H310 is passed through to the VM, which can see all four disks just fine. I created a volume using the four disks as RAIDZ1, created a Samba share, and started copying some stuff over. This is where the problems arise.

Here is the zpool output:

Code:

  pool: plzwork
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 69.6M in 0 days 00:00:08 with 0 errors on Fri Nov  9 21:56:58 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        plzwork                                         DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/49197dbc-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4a01851a-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4af7694d-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4bdfefa3-e49c-11e8-94c9-d5c87de5f925  FAULTED      0 11.2K     0  too many errors

errors: No known data errors
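For anyone who wants to pull the unhealthy rows out of output like this automatically, here is a minimal Python sketch. It assumes the `zpool status` text has already been captured into a string; the function name `unhealthy_vdevs` and the trimmed `SAMPLE` text are made up for illustration, and the parsing relies only on the whitespace-separated column layout shown above.

```python
# Trimmed copy of the config section from the post above (illustrative input).
SAMPLE = """\
config:

    NAME                                            STATE     READ WRITE CKSUM
    plzwork                                         DEGRADED     0     0     0
      raidz1-0                                      DEGRADED     0     0     0
        gptid/4af7694d-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
        gptid/4bdfefa3-e49c-11e8-94c9-d5c87de5f925  FAULTED      0 11.2K     0  too many errors

errors: No known data errors
"""


def unhealthy_vdevs(status_text):
    """Return (name, state, read, write, cksum) for every row that is not ONLINE."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME":
            # Column header: the device rows start on the next line.
            in_config = True
            continue
        if fields[0] == "errors:":
            # The config table ends at the "errors:" summary line.
            break
        if in_config and len(fields) >= 5 and fields[1] != "ONLINE":
            rows.append(tuple(fields[:5]))
    return rows


for name, state, read, write, cksum in unhealthy_vdevs(SAMPLE):
    print(f"{state:9} write_errors={write:6} {name}")
```

The pool and the raidz1 vdev show up as DEGRADED only because they contain the faulted member; the row with the non-zero WRITE counter is the one that matters here.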

The errors on the console all resemble

Code:

(da3:isci0:0:3:0): CAM status: SCSI Status Error

or

Code:

(da3:isci0:0:3:0): WRITE(10). CDB: 2a 00 00 54 ab 90 00 01 00 00
(da3:isci0:0:3:0): CAM status: CCB request terminated by the host
(da3:isci0:0:3:0): Retrying command

among others that I can't reproduce.
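The CDB bytes in the console line can be decoded by hand, which shows where on the disk the failing write was aimed. The sketch below assumes the standard 10-byte SCSI READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte big-endian transfer length, control); the helper name `decode_rw10` is hypothetical.

```python
import struct

# Opcodes for the two 10-byte block commands handled by this sketch.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # > = big-endian: opcode, flags, LBA (4 bytes), group, length (2 bytes), control
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode not in OPCODES:
        raise ValueError(f"unsupported opcode 0x{opcode:02x}")
    return {"op": OPCODES[opcode], "lba": lba, "blocks": blocks}


# CDB copied from the console message above.
print(decode_rw10("2a 00 00 54 ab 90 00 01 00 00"))
# {'op': 'WRITE(10)', 'lba': 5548944, 'blocks': 256}
```

So the logged command was a 256-block write starting at LBA 5548944. Decoding several such lines would show whether the failures cluster around particular LBAs or are spread across the disk.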

All four disks are the same age (~2 years) and model. The only difference between the four is the power-on hours. I ran these disks in a different machine for about a year but kept one out as a spare, and eventually just added it to the array as well. This accounts for a ~2500 hour difference.

Interestingly enough, the drive with the lower hour count is the one with all the errors. I have attached its SMART output; hopefully someone with a trained eye can point out issues there. I see quite a few errors. It has been in every slot possible, to try to eliminate cabling issues. The drives are mounted in the Dell caddies with rubber isolation pads. I have also plugged the drives into the onboard SAS controller, with the same results.

I have read quite a few other threads about this, and there seems to be no consensus on what causes it, although some point the finger at the drive itself, which was my first thought. This happens to this one drive only.

Hopefully someone can provide some insight into this! Thank you all in advance!

rvassar · Nov 9, 2018

You could pull the drive, and run badblocks or another test against it for a time, and see if it clears up. But... It looks like a prematurely failing disk. It happens... RMA it and be done...

Johnnie Black · Nov 10, 2018

It appears to be a disk problem: there's a UNC @ LBA error, i.e. a read error.
