Another DEGRADED array, SCSI errors thread

Status
Not open for further replies.

trashlab

Cadet
Joined
Nov 9, 2018
Messages
1
Hi all,

I am brand new to FreeNAS as of today! Have wanted to get my feet wet in the homelab world for a while and the parts have finally arrived.

Here's the hardware:
Dell Precision T7600 chassis
2x E5-2630 (6-core, 2.3 GHz)
96GB DDR3 ECC
PERC H310 flashed to IT mode following this guide: https://techmattr.wordpress.com/201...-flashing-to-it-mode-dell-perc-h200-and-h310/
4x WD Red 3TB, using the bays in the front of the machine.

I currently have FreeNAS 11.1 U6 running in a VM under Proxmox VE. The H310 is passed through to the VM, which can see all four disks just fine. I created a volume using the four disks as RAIDZ1, created a Samba share, and started copying some stuff over. This is where the problems arise.

Here is the zpool output:
Code:
  pool: plzwork
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 69.6M in 0 days 00:00:08 with 0 errors on Fri Nov  9 21:56:58 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        plzwork                                         DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/49197dbc-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4a01851a-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4af7694d-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
            gptid/4bdfefa3-e49c-11e8-94c9-d5c87de5f925  FAULTED      0 11.2K     0  too many errors

errors: No known data errors
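
In case it helps anyone scripting around this, here is a minimal Python sketch that pulls the per-device rows out of a config table like the one above. The names `parse_config`, `ROW`, and `STATUS` are made up for illustration and are not part of FreeNAS or ZFS tooling; the sketch assumes the column layout matches this output (NAME, STATE, READ, WRITE, CKSUM, optional note).

```python
import re

# Sample text copied from the zpool output above (whitespace normalized).
STATUS = """\
  pool: plzwork
 state: DEGRADED
config:

    NAME                                            STATE     READ WRITE CKSUM
    plzwork                                         DEGRADED     0     0     0
      raidz1-0                                      DEGRADED     0     0     0
        gptid/49197dbc-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
        gptid/4a01851a-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
        gptid/4af7694d-e49c-11e8-94c9-d5c87de5f925  ONLINE       0     0     0
        gptid/4bdfefa3-e49c-11e8-94c9-d5c87de5f925  FAULTED      0 11.2K     0  too many errors

errors: No known data errors
"""

# One config row: indented name, a known state word, three counters, optional note.
ROW = re.compile(
    r"^\s+(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+(.*))?$"
)


def parse_config(status_text):
    """Return one dict per pool/vdev/device row found in the status text."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line.rstrip())
        if match:
            name, state, read, write, cksum, note = match.groups()
            rows.append({
                "name": name,
                "state": state,
                "read": read,      # counters kept as text, e.g. "11.2K"
                "write": write,
                "cksum": cksum,
                "note": note or "",
            })
    return rows


if __name__ == "__main__":
    for row in parse_config(STATUS):
        if row["state"] != "ONLINE":
            print(row["name"], row["state"], "write errors:", row["write"], row["note"])
```

The counters are left as strings because zpool abbreviates large values (11.2K here), so the exact count is not recoverable from this output anyway.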


The errors on the console all resemble
Code:
(da3:isci0:0:3:0): CAM status: SCSI Status Error

or
Code:
(da3:isci0:0:3:0): WRITE(10). CDB: 2a 00 00 54 ab 90 00 01 00 00
(da3:isci0:0:3:0): CAM status: CCB request terminated by the host
(da3:isci0:0:3:0): Retrying command

among others that I can't reproduce.
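
To make the WRITE(10) line a bit easier to read, here is a minimal Python sketch that decodes the CDB bytes. It assumes the standard 10-byte WRITE(10) layout (opcode 0x2A in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8); `decode_write10` is a hypothetical helper name, not something from FreeBSD or FreeNAS.

```python
def decode_write10(cdb_hex):
    """Decode a WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("WRITE(10) CDB must be exactly 10 bytes")
    if cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) opcode: 0x%02x" % cdb[0])
    return {
        "opcode": cdb[0],
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }


if __name__ == "__main__":
    # CDB copied from the console message above.
    print(decode_write10("2a 00 00 54 ab 90 00 01 00 00"))
```

For the CDB in the console message this gives a starting LBA of 5548944 and a transfer length of 256 blocks, so the failing command is an ordinary write fairly early on the disk.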

All four disks are the same age (~2 years) and model. The only difference between the four is the power-on hours: I had these disks in a different machine for about a year but kept one out as a spare, and eventually just used it in the array as well. This accounts for a ~2500-hour difference.

Interestingly enough, the drive with the lower hour count is the one with all the errors. I have attached its SMART output; hopefully someone with a trained eye can point out issues there. I see quite a few errors. It has been in every slot possible, to try to eliminate cabling issues. The drives are mounted in the Dell caddies with rubber isolation pads. I have also plugged the drives into the onboard SAS controller, yielding the same results.

I have read quite a few other threads about this, and there does not seem to be a consensus on what causes it, although some point the finger at the drive itself, which was my first thought. This happens to this one drive only.

Hopefully someone can provide some insight into this! Thank you all in advance!
 

Attachments

  • smartctl_da3.txt
    23.6 KB · Views: 241

rvassar

Guru
Joined
May 2, 2018
Messages
972
You could pull the drive, and run badblocks or another test against it for a time, and see if it clears up. But... It looks like a prematurely failing disk. It happens... RMA it and be done...
 