Pool degraded - failed to read SMART Attribute Data

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
FreeNAS 11.1-U3
Supermicro X11SSL-CF with six Toshiba DT01ACA3 drives connected to the LSI 3008 (FW PH15.00.03.00).

I got a critical alert today:
Code:
CRITICAL: March 24, 2018, 8:52 a.m. - Device: /dev/da3 [SAT], failed to read SMART Attribute Data
CRITICAL: March 24, 2018, 8:48 a.m. - The volume pool1 state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.

During the initial testing and setup a couple of days ago, I received the same alert for /dev/da1. Both then and now, the drives respond to SMART reads and show good health. After that first /dev/da1 error I destroyed the pool and did a fresh install and setup, but now the same thing has happened with /dev/da3.

What could be the issue? I'm setting this NAS up to take over the data role for the whole office and need it to be reliable.

Code:
Mar 24 08:47:34 FreeNAS	(da3:mpr0:0:5:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 343 Aborting command 0xfffffe0000ee1d10
Mar 24 08:47:34 FreeNAS mpr0: Sending reset from mprsas_send_abort for target ID 5
Mar 24 08:47:34 FreeNAS smartd[2977]: Device: /dev/da3 [SAT], failed to read SMART Attribute Data
Mar 24 08:47:35 FreeNAS	(pass3:mpr0:0:5:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 374 terminated ioc 804b loginfo 31130000 scsi 0 state c xfer 0
Mar 24 08:47:35 FreeNAS mpr0: Unfreezing devq for target ID 5
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): CAM status: Command timeout
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): Retrying command
Mar 24 08:47:35 FreeNAS ZFS: vdev state changed, pool_guid=5585696111870732690 vdev_guid=14774311635562606667
Mar 24 08:47:35 FreeNAS ZFS: vdev state changed, pool_guid=5585696111870732690 vdev_guid=14774311635562606667
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): CAM status: SCSI Status Error
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): SCSI status: Check Condition
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): Error 6, Retries exhausted
Mar 24 08:47:35 FreeNAS (da3:mpr0:0:5:0): Invalidating pack
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[WRITE(offset=249856, length=4096)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[WRITE(offset=512000, length=4096)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=270336, length=8192)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998444761088, length=8192)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998445023232, length=8192)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[WRITE(offset=2998445002752, length=4096)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[WRITE(offset=2998445264896, length=4096)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=270336, length=8192)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998444761088, length=8192)]
Mar 24 08:47:35 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998445023232, length=8192)]
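
To see at a glance which devices reported which CAM statuses, a log like the one above can be condensed into counts. This is only a sketch that assumes the message layout shown in this excerpt; the function name `cam_status_summary` is made up for illustration.

```shell
#!/bin/sh
# cam_status_summary: hypothetical helper. Reads syslog lines on stdin and
# prints "<count> <device> <CAM status>" for each distinct device/status pair.
cam_status_summary() {
    # Keep only "CAM status:" lines. Capture the device name (e.g. da3) from
    # "(da3:mpr0:0:5:0):" and the status text that follows "CAM status: ".
    sed -n 's/.*(\([a-z]*[0-9]*\):[^)]*): CAM status: \(.*\)$/\1 \2/p' |
        sort | uniq -c | sed 's/^ *//'
}
```

Assuming the system log is at /var/log/messages, it could be run as `cam_status_summary < /var/log/messages`. Lines without a "CAM status:" field (retries, GEOM_ELI errors) are ignored.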

Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)															
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org														
																																
=== START OF INFORMATION SECTION ===																							
Model Family:	 Toshiba 3.5" DT01ACA... Desktop HDD																			
Device Model:	 TOSHIBA DT01ACA300																							
Serial Number:	55C5P15GS																										
LU WWN Device Id: 5 000039 fe3c294d4																							
Firmware Version: MX6OABB0																										
User Capacity:	3,000,592,982,016 bytes [3.00 TB]																				
Sector Sizes:	 512 bytes logical, 4096 bytes physical																		
Rotation Rate:	7200 rpm																										
Form Factor:	  3.5 inches																									
Device is:		In smartctl database [for details use: -P show]																
ATA Version is:   ATA8-ACS T13/1699-D revision 4																				
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)																		
Local Time is:	Sat Mar 24 15:10:38 2018 CET																					
SMART support is: Available - device has SMART capability.																		
SMART support is: Enabled																										
																																
=== START OF READ SMART DATA SECTION ===																						
SMART overall-health self-assessment test result: PASSED																		
																																
General SMART Values:																											
Offline data collection status:  (0x84) Offline data collection activity														
										was suspended by an interrupting command from host.										
										Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (   0) The previous self-test routine completed												
										without error or no self-test has ever													
										been run.																				
Total time to complete Offline																									
data collection:				(22078) seconds.																				
Offline data collection																											
capabilities:					(0x5b) SMART execute Offline immediate.														
										Auto Offline data collection on/off support.											
										Suspend Offline collection upon new														
										command.																				
										Offline surface scan supported.															
										Self-test supported.																	
										No Conveyance Self-test supported.														
										Selective Self-test supported.															
SMART capabilities:			(0x0003) Saves SMART data before entering														
										power-saving mode.																		
										Supports SMART auto save timer.															
Error logging capability:		(0x01) Error logging supported.																
										General Purpose Logging supported.														
Short self-test routine																											
recommended polling time:		(   1) minutes.																				
Extended self-test routine							
recommended polling time:		( 368) minutes.																				
SCT capabilities:			  (0x003d) SCT Status supported.																	
										SCT Error Recovery Control supported.													
										SCT Feature Control supported.															
										SCT Data Table supported.																
																																
SMART Attributes Data Structure revision number: 16																				
Vendor Specific SMART Attributes with Thresholds:																				
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE								
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0										
  2 Throughput_Performance  0x0005   140   140   054	Pre-fail  Offline	  -	   69										
  3 Spin_Up_Time			0x0007   134   134   024	Pre-fail  Always	   -	   428 (Average 430)						
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   88										
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0										
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0										
  8 Seek_Time_Performance   0x0005   124   124   020	Pre-fail  Offline	  -	   33										
  9 Power_On_Hours		  0x0012   100   100   000	Old_age   Always	   -	   491										
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0										
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   88										
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   98										
193 Load_Cycle_Count		0x0012   100   100   000	Old_age   Always	   -	   98										
194 Temperature_Celsius	 0x0002   142   142   000	Old_age   Always	   -	   42 (Min/Max 17/50)						
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0										
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0										
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0										
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0										
																																
SMART Error Log Version: 1																										
No Errors Logged																												
																																
SMART Self-test log structure revision number 1																					
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Extended offline	Completed without error	   00%	   239		 -													
# 2  Short offline	   Completed without error	   00%	   234		 -													
# 3  Extended offline	Completed without error	   00%		92		 -													
# 4  Short offline	   Completed without error	   00%		86		 -													
# 5  Short offline	   Completed without error	   00%		39		 -													
																																
SMART Selective self-test log data structure revision number 1																	
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																					
	1		0		0  Not_testing																							
	2		0		0  Not_testing																							
	3		0		0  Not_testing																							
	4		0		0  Not_testing																							
	5		0		0  Not_testing																							
Selective self-test flags (0x0):																								
  After scanning selected spans, do NOT read-scan remainder of disk.															
If Selective self-test is pending on power-up, resume after 0 minute delay.	

Code:
[root@FreeNAS ~]# zpool status																								
  pool: pool1																												  
 state: DEGRADED																												  
status: One or more devices are faulted in response to persistent errors.														  
		Sufficient replicas exist for the pool to continue functioning in a														
		degraded state.																											
action: Replace the faulted device, or use 'zpool clear' to mark the device														
		repaired.																												  
  scan: none requested																											
config:																															
																																  
		NAME												STATE	 READ WRITE CKSUM											
		pool1											   DEGRADED	 0	 0	 0											
		  mirror-0										  ONLINE	   0	 0	 0											
			gptid/37303cdf-2e8e-11e8-9e93-0cc47a868d02.eli  ONLINE	   0	 0	 0											
			gptid/3a7cdb6d-2e8e-11e8-9e93-0cc47a868d02.eli  ONLINE	   0	 0	 0											
		  mirror-1										  DEGRADED	 0	 0	 0											
			gptid/40bcca1a-2e8e-11e8-9e93-0cc47a868d02.eli  ONLINE	   0	 0	 0											
			gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli  FAULTED	  6	 4	 0  too many errors							
		  mirror-2										  ONLINE	   0	 0	 0											
			gptid/4a6bd536-2e8e-11e8-9e93-0cc47a868d02.eli  ONLINE	   0	 0	 0											
			gptid/4dcdc4cb-2e8e-11e8-9e93-0cc47a868d02.eli  ONLINE	   0	 0	 0											
																																  
errors: No known data errors																									  
																																  
  pool: freenas-boot																											  
 state: ONLINE																													
  scan: none requested																											
config:																															
																																  
		NAME		STATE	 READ WRITE CKSUM																					
		freenas-boot  ONLINE	   0	 0	 0																				  
		  ada0p2	ONLINE	   0	 0	 0																					
																																  
errors: No known data errors				  
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Your drives run too hot, you must improve the cooling to keep them under 40 °C.

It may be the source of your problem but I doubt it as usually the errors start at 50-55 °C.
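
For checking a drive against that 40 °C guideline, the raw temperature can be pulled out of the attribute table. A minimal sketch, assuming the classic `smartctl -A` column layout in which RAW_VALUE is the tenth field (the brief `-x` layout has fewer columns); `smart_temp` and `too_hot` are hypothetical helper names.

```shell
#!/bin/sh
# smart_temp: reads `smartctl -A` output on stdin and prints the raw value
# of attribute Temperature_Celsius (10th whitespace-separated field).
smart_temp() {
    awk '$2 == "Temperature_Celsius" { print $10 }'
}

# too_hot: usage `too_hot <temp> <limit>`; succeeds when temp exceeds limit.
too_hot() {
    [ "$1" -gt "$2" ]
}
```

Possible use: `t=$(smartctl -A /dev/da3 | smart_temp); too_hot "$t" 40 && echo "da3 is at ${t} C"`.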
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
Your drives run too hot, you must improve the cooling to keep them under 40 °C.

It may be the source of your problem but I doubt it as usually the errors start at 50-55 °C.
I don't believe that is the cause as they are working at 40-46 °C these days and have a stated operating temperature of up to 60 °C.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
In the meantime, what would be the procedure to mark the drive as ok and have it go back online? The pool is encrypted if that matters.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
have a stated operating temperature of up to 60 °C.
That is the "nope, your warranty is void!" temperature, not even close to a reasonable maximum.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
That is the "nope, your warranty is void!" temperature, not even close to a reasonable maximum.
Oh, it's not that I'm planning to go anywhere near that. I will be running them in the 30-35 °C range in the final location. They are currently going up to 46 °C in the "preparation room", and 50 °C was the peak for a brief time during the initial stress testing.

Any tips on how to properly bring the drive back from the FAULTED state and/or how to go about figuring out what the problem with the SMART reads is?
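
The `zpool status` output earlier in the thread itself suggests `zpool clear` for marking a device repaired. As a cautious sketch, the helper below only prints the matching commands for review and does not run them; `suggest_clear` is a hypothetical name. Clearing the errors does not address whatever caused them, and whether an encrypted (.eli) provider needs extra steps first is not covered here.

```shell
#!/bin/sh
# suggest_clear: hypothetical helper. Reads `zpool status` output on stdin
# and prints one "zpool clear <pool> <device>" line per FAULTED device.
# It only prints the commands; nothing is executed.
suggest_clear() {
    awk '
        $1 == "pool:" { pool = $2 }          # remember the current pool name
        $2 == "FAULTED" && $1 != "state:" && $1 != pool {
            printf "zpool clear %s %s\n", pool, $1
        }
    '
}
```

Possible use: `zpool status | suggest_clear`, then run the printed command by hand once the cause of the resets is understood.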
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm not convinced the drive is functional (not because of the temperature, just the errors in general). What's the output of smartctl -x /dev/da3?
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
I'm not convinced the drive is functional (not because of the temperature, just the errors in general). What's the output of smartctl -x /dev/da3?
Here it is.

By the way, is there an easier way to copy and paste these long reports than using "smartctl -x /dev/da3 | more" and doing it section by section?
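
One common approach to the paging problem is to redirect the whole report into a file and copy that instead. A small sketch, where `save_report` is a hypothetical wrapper and the output path is just an example:

```shell
#!/bin/sh
# save_report: usage `save_report <outfile> <command> [args...]`.
# Runs the command and writes its stdout and stderr to <outfile>.
save_report() {
    out=$1
    shift
    "$@" > "$out" 2>&1
}
```

For example, `save_report /tmp/smart_da3.txt smartctl -x /dev/da3` leaves the full report in one file that can be opened in an editor or copied off the box.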

Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)															 
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org														 
																																   
=== START OF INFORMATION SECTION ===																							   
Model Family:	 Toshiba 3.5" DT01ACA... Desktop HDD																			   
Device Model:	 TOSHIBA DT01ACA300																							   
Serial Number:	55C5P15GS																										 
LU WWN Device Id: 5 000039 fe3c294d4																							   
Firmware Version: MX6OABB0																										 
User Capacity:	3,000,592,982,016 bytes [3.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																		   
Rotation Rate:	7200 rpm																										 
Form Factor:	  3.5 inches																									   
Device is:		In smartctl database [for details use: -P show]																   
ATA Version is:   ATA8-ACS T13/1699-D revision 4																				   
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)																		   
Local Time is:	Sat Mar 24 21:29:23 2018 CET																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
AAM feature is:   Unavailable																									   
APM feature is:   Disabled																										 
Rd look-ahead is: Enabled																										   
Write cache is:   Enabled																										   
DSN feature is:   Unavailable																									   
ATA Security is:  Disabled, NOT FROZEN [SEC1]																					   
Wt Cache Reorder: Enabled																										   
																																   
=== START OF READ SMART DATA SECTION ===																						   
SMART overall-health self-assessment test result: PASSED																		   
																																   
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity														   
										was completed without error.															   
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (   0) The previous self-test routine completed												   
										without error or no self-test has ever													 
										been run.																				   
Total time to complete Offline																									 
data collection:				(22078) seconds.																				   
Offline data collection																											 
capabilities:					(0x5b) SMART execute Offline immediate.														   
										Auto Offline data collection on/off support.											   
										Suspend Offline collection upon new														 
										command.																				   
										Offline surface scan supported.															 
										Self-test supported.																	   
										No Conveyance Self-test supported.														 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering   
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																   
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.																				   
Extended self-test routine																										 
recommended polling time:		( 368) minutes.																				   
SCT capabilities:			  (0x003d) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																   
SMART Attributes Data Structure revision number: 16																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAGS	VALUE WORST THRESH FAIL RAW_VALUE															 
  1 Raw_Read_Error_Rate	 PO-R--   100   100   016	-	0																	 
  2 Throughput_Performance  P-S---   140   140   054	-	69																	 
  3 Spin_Up_Time			POS---   134   134   024	-	428 (Average 430)													 
  4 Start_Stop_Count		-O--C-   100   100   000	-	88																	 
  5 Reallocated_Sector_Ct   PO--CK   100   100   005	-	0																	 
  7 Seek_Error_Rate		 PO-R--   100   100   067	-	0																	 
  8 Seek_Time_Performance   P-S---   124   124   020	-	33																	 
  9 Power_On_Hours		  -O--C-   100   100   000	-	498																   
 10 Spin_Retry_Count		PO--C-   100   100   060	-	0																	 
 12 Power_Cycle_Count	   -O--CK   100   100   000	-	88																	 
192 Power-Off_Retract_Count -O--CK   100   100   000	-	99																	 
193 Load_Cycle_Count		-O--C-   100   100   000	-	99																	 
194 Temperature_Celsius	 -O----   153   153   000	-	39 (Min/Max 17/50)													 
196 Reallocated_Event_Count -O--CK   100   100   000	-	0																	 
197 Current_Pending_Sector  -O---K   100   100   000	-	0																	 
198 Offline_Uncorrectable   ---R--   100   100   000	-	0																	 
199 UDMA_CRC_Error_Count	-O-R--   200   200   000	-	0																	 
							||||||_ K auto-keep																					 
							|||||__ C event count																				   
							||||___ R error rate																				   
							|||____ S speed/performance																			 
							||_____ O updated online																			   
							|______ P prefailure warning																		   
																																   
General Purpose Log Directory Version 1																							 
SMART		   Log Directory Version 1 [multi-sector log support]																 
Address	Access  R/W   Size  Description																						 
0x00	   GPL,SL  R/O	  1  Log Directory																					   
0x01		   SL  R/O	  1  Summary SMART error log																			 
0x03	   GPL	 R/O	  1  Ext. Comprehensive SMART error log																   
0x04	   GPL	 R/O	  7  Device Statistics log																			   
0x06		   SL  R/O	  1  SMART self-test log																				 
0x07	   GPL	 R/O	  1  Extended self-test log		 
0x08	   GPL	 R/O	  2  Power Conditions log																				 
0x09		   SL  R/W	  1  Selective self-test log																			 
0x10	   GPL	 R/O	  1  NCQ Command Error log																			   
0x11	   GPL	 R/O	  1  SATA Phy Event Counters log																		 
0x20	   GPL	 R/O	  1  Streaming performance log [OBS-8]																   
0x21	   GPL	 R/O	  1  Write stream error log																			   
0x22	   GPL	 R/O	  1  Read stream error log																			   
0x80-0x9f  GPL,SL  R/W	 16  Host vendor specific log																			 
0xe0	   GPL,SL  R/W	  1  SCT Command/Status																				   
0xe1	   GPL,SL  R/W	  1  SCT Data Transfer																				   
																																   
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)																	   
No Errors Logged																												   
																																   
SMART Extended Self-test Log Version: 1 (1 sectors)																				 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Completed without error	   00%	   239		 -													 
# 2  Short offline	   Completed without error	   00%	   234		 -													 
# 3  Extended offline	Completed without error	   00%		92		 -													 
# 4  Short offline	   Completed without error	   00%		86		 -													 
# 5  Short offline	   Completed without error	   00%		39		 -													 
																																   
SMART Selective self-test log data structure revision number 1																	 
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																					   
	1		0		0  Not_testing																							   
	2		0		0  Not_testing																							   
	3		0		0  Not_testing																							   
	4		0		0  Not_testing																							   
	5		0		0  Not_testing																							   
Selective self-test flags (0x0):																								   
  After scanning selected spans, do NOT read-scan remainder of disk.															   
If Selective self-test is pending on power-up, resume after 0 minute delay.														 
																																   
SCT Status Version:				  3																							 
SCT Version (vendor specific):	   256 (0x0100)																				   
SCT Support Level:				   1																							 
Device State:						Active (0)																					 
Current Temperature:					39 Celsius																				 
Power Cycle Min/Max Temperature:	 39/45 Celsius																				 
Lifetime	Min/Max Temperature:	 17/50 Celsius																				 
Under/Over Temperature Limit Count:   0/0																						   
																																   
SCT Temperature History Version:	 2																							 
Temperature Sampling Period:		 1 minute																					   
Temperature Logging Interval:		1 minute																					   
Min/Max recommended Temperature:	  0/60 Celsius																				 
Min/Max Temperature Limit:		   -40/70 Celsius																				 
Temperature History Size (Index):	128 (88)	 
Index	Estimated Time   Temperature Celsius																					   
  89	2018-03-24 19:24	39  ********************																			   
 ...	..(126 skipped).	..  ********************																			   
  88	2018-03-24 21:31	39  ********************																			   
																																   
SCT Error Recovery Control:																										 
		   Read: Disabled																										   
		  Write: Disabled																										   
																																   
Device Statistics (GP Log 0x04)																									 
Page  Offset Size		Value Flags Description																				   
0x01  =====  =			   =  ===  == General Statistics (rev 1) ==															   
0x01  0x008  4			  88  ---  Lifetime Power-On Resets																	   
0x01  0x010  4			 498  ---  Power-on Hours																				 
0x01  0x018  6	 76298045938  ---  Logical Sectors Written																	   
0x01  0x020  6	   251060527  ---  Number of Write Commands																	   
0x01  0x028  6	 66137872954  ---  Logical Sectors Read																		   
0x01  0x030  6	   197944476  ---  Number of Read Commands																	   
0x03  =====  =			   =  ===  == Rotating Media Statistics (rev 1) ==													   
0x03  0x008  4			 498  ---  Spindle Motor Power-on Hours																   
0x03  0x010  4			 498  ---  Head Flying Hours																			 
0x03  0x018  4			  99  ---  Head Load Events																			   
0x03  0x020  4			   0  ---  Number of Reallocated Logical Sectors														 
0x03  0x028  4			   1  ---  Read Recovery Attempts																		 
0x03  0x030  4			   6  ---  Number of Mechanical Start Failures														   
0x04  =====  =			   =  ===  == General Errors Statistics (rev 1) ==													   
0x04  0x008  4			   0  ---  Number of Reported Uncorrectable Errors													   
0x04  0x010  4			   0  ---  Resets Between Cmd Acceptance and Completion												   
0x05  =====  =			   =  ===  == Temperature Statistics (rev 1) ==														   
0x05  0x008  1			  39  ---  Current Temperature																		   
0x05  0x010  1			  41  N--  Average Short Term Temperature																 
0x05  0x018  1			  40  N--  Average Long Term Temperature																 
0x05  0x020  1			  50  ---  Highest Temperature																		   
0x05  0x028  1			  17  ---  Lowest Temperature																			 
0x05  0x030  1			  47  N--  Highest Average Short Term Temperature														 
0x05  0x038  1			  25  N--  Lowest Average Short Term Temperature														 
0x05  0x040  1			  45  N--  Highest Average Long Term Temperature														 
0x05  0x048  1			  25  N--  Lowest Average Long Term Temperature														   
0x05  0x050  4			   0  ---  Time in Over-Temperature																	   
0x05  0x058  1			  60  ---  Specified Maximum Operating Temperature													   
0x05  0x060  4			   0  ---  Time in Under-Temperature																	 
0x05  0x068  1			   0  ---  Specified Minimum Operating Temperature													   
0x06  =====  =			   =  ===  == Transport Statistics (rev 1) ==															 
0x06  0x008  4			 393  ---  Number of Hardware Resets																	 
0x06  0x010  4			 142  ---  Number of ASR Events																		   
0x06  0x018  4			   0  ---  Number of Interface CRC Errors																 
								|||_ C monitored condition met																	 
								||__ D supports DSN																				 
								|___ N normalized value			 
																																   
Pending Defects log (GP Log 0x0c) not supported																					 
																																   
SATA Phy Event Counters (GP Log 0x11)																							   
ID	  Size	 Value  Description																								 
0x0001  2			0  Command failed due to ICRC error																		   
0x0002  2			0  R_ERR response for data FIS																				 
0x0003  2			0  R_ERR response for device-to-host data FIS																 
0x0004  2			0  R_ERR response for host-to-device data FIS																 
0x0005  2			0  R_ERR response for non-data FIS																			 
0x0006  2			0  R_ERR response for device-to-host non-data FIS															 
0x0007  2			0  R_ERR response for host-to-device non-data FIS															 
0x0009  2		   14  Transition from drive PhyRdy to drive PhyNRdy															   
0x000a  2		   15  Device-to-host register FISes sent due to a COMRESET													   
0x000b  2			0  CRC errors within host-to-device FIS																	   
0x000d  2			0  Non-CRC errors within host-to-device FIS	 
0x01  0x030  6	   197944476  ---  Number of Read Commands																	   
0x03  =====  =			   =  ===  == Rotating Media Statistics (rev 1) ==													   
0x03  0x008  4			 498  ---  Spindle Motor Power-on Hours																   
0x03  0x010  4			 498  ---  Head Flying Hours																			 
0x03  0x018  4			  99  ---  Head Load Events																			   
0x03  0x020  4			   0  ---  Number of Reallocated Logical Sectors														 
0x03  0x028  4			   1  ---  Read Recovery Attempts																		 
0x03  0x030  4			   6  ---  Number of Mechanical Start Failures														   
0x04  =====  =			   =  ===  == General Errors Statistics (rev 1) ==													   
0x04  0x008  4			   0  ---  Number of Reported Uncorrectable Errors													   
0x04  0x010  4			   0  ---  Resets Between Cmd Acceptance and Completion												   
0x05  =====  =			   =  ===  == Temperature Statistics (rev 1) ==														   
0x05  0x008  1			  39  ---  Current Temperature																		   
0x05  0x010  1			  41  N--  Average Short Term Temperature																 
0x05  0x018  1			  40  N--  Average Long Term Temperature																 
0x05  0x020  1			  50  ---  Highest Temperature																		   
0x05  0x028  1			  17  ---  Lowest Temperature																			 
0x05  0x030  1			  47  N--  Highest Average Short Term Temperature														 
0x05  0x038  1			  25  N--  Lowest Average Short Term Temperature														 
0x05  0x040  1			  45  N--  Highest Average Long Term Temperature														 
0x05  0x048  1			  25  N--  Lowest Average Long Term Temperature														   
0x05  0x050  4			   0  ---  Time in Over-Temperature																	   
0x05  0x058  1			  60  ---  Specified Maximum Operating Temperature													   
0x05  0x060  4			   0  ---  Time in Under-Temperature																	 
0x05  0x068  1			   0  ---  Specified Minimum Operating Temperature													   
0x06  =====  =			   =  ===  == Transport Statistics (rev 1) ==															 
0x06  0x008  4			 393  ---  Number of Hardware Resets																	 
0x06  0x010  4			 142  ---  Number of ASR Events																		   
0x06  0x018  4			   0  ---  Number of Interface CRC Errors																 
								|||_ C monitored condition met																	 
								||__ D supports DSN																				 
								|___ N normalized value																			 
																																   
Pending Defects log (GP Log 0x0c) not supported																					 
																																   
SATA Phy Event Counters (GP Log 0x11)																							   
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2           14  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2           15  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
btw Is there an easier way to copy&paste these long reports than using "smartctl -x /dev/da3 | more" and doing it section by section?
SSH into your server, then (if your SSH client is even remotely decent) you'll have a scrollback buffer. Then you can copy/paste everything in one block.

I don't see obvious problems other than temperature in the output you posted, but you do need to set up regular SMART self-tests.
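
On the copy/paste question, an alternative to relying on scrollback is to redirect the whole report into a file and copy that instead. A minimal sketch, assuming a POSIX shell on the server; the function name save_smart_report and the SMARTCTL override variable are made up for illustration and are not part of FreeNAS or smartmontools:

```shell
# Write the full `smartctl -x` report for one device to a file.
# SMARTCTL is a hypothetical override (useful for dry runs); it defaults to smartctl.
save_smart_report() {
    dev="$1"   # device node, e.g. /dev/da3
    out="$2"   # destination file, e.g. /tmp/da3-smart.txt
    "${SMARTCTL:-smartctl}" -x "$dev" > "$out"
}
```

Something like `save_smart_report /dev/da3 /tmp/da3-smart.txt` would then leave the complete report in one file that can be opened in a pager or copied off the box.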
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
0x03 0x030 4 6 --- Number of Mechanical Start Failures
Doesn't sound good. And it correlates with the number of read errors...

Keep an eye on this drive. For now, use zpool clear.

Sidenote: I like all the detailed data this drive shows. Pretty cool.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
I don't see obvious problems other than temperature in the output you posted, but you do need to set up regular SMART self-tests.
I've set up a long test twice a month and a short test twice a week.


Doesn't sound good. And it correlates with the number of read errors...

Keep an eye on this drive. For now, use zpool clear.

Sidenote: I like all the detailed data this drive shows. Pretty cool.

It surprised me :)

I've found similar problems and reported bugs:
https://forums.freenas.org/index.php?threads/false-smart-errors.60608/
https://redmine.ixsystems.com/issues/28235
https://redmine.ixsystems.com/issues/28201
https://forums.freenas.org/index.php?threads/read-smart-error-log-failed-after-every-test.35601/

I'm not sure the problem is the same, and I don't see any mps0 chain frames errors in my log.

Kind of worried that the error pops up for different disks.
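
To see how the timeouts are spread across disks, the log can be tallied per device. A rough sketch, assuming the syslog lines look like the "CAM status: Command timeout" entries posted in this thread; count_cam_timeouts is a hypothetical helper name, and it only reads text from stdin:

```shell
# Count "CAM status: Command timeout" lines per daN device from syslog text on stdin.
count_cam_timeouts() {
    awk '/CAM status: Command timeout/ {
        # The device tag looks like "(da3:mpr0:0:5:0):"; keep only the name before the first colon.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^\(da[0-9]+:/) {
                dev = $i
                sub(/^\(/, "", dev)
                sub(/:.*/, "", dev)
                count[dev]++
            }
        }
    }
    END { for (d in count) print d, count[d] }' | sort
}
```

Running it as `count_cam_timeouts < /var/log/messages` (assuming that is where the messages end up) prints one line per device with its timeout count, which makes it easier to tell a single bad disk from something shared such as the controller, cable or power.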

One thing that crossed my mind is the FW/driver relationship for LSI controllers where the driver is one version number ahead.

I have:
mpr0: Firmware: 15.00.03.00, Driver: 15.03.00.00-fbsd

Could this be a possible cause? Should I downgrade the SAS FW?
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
zpool clear produces the following in the console:
Code:
Mar 25 00:35:57 FreeNAS ZFS: vdev state changed, pool_guid=5585696111870732690 vdev_guid=13923756651256524743
Mar 25 00:35:57 FreeNAS ZFS: vdev state changed, pool_guid=5585696111870732690 vdev_guid=14774311635562606667
Mar 25 00:35:57 FreeNAS ZFS: vdev state changed, pool_guid=5585696111870732690 vdev_guid=14774311635562606667
Mar 25 00:35:58 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=270336, length=8192)]
Mar 25 00:35:58 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998444761088, length=8192)]
Mar 25 00:35:58 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/441118bb-2e8e-11e8-9e93-0cc47a868d02.eli[READ(offset=2998445023232, length=8192)]

The drive is still faulted. Should I OFFLINE the drive and then replace it with itself? If so, do I have to follow the encrypted drive replacement procedure even if it is the same drive?
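
For reference, a small sketch of how the faulted member could be picked out of the pool status by name. It assumes the usual NAME / STATE / READ / WRITE / CKSUM layout of `zpool status`; faulted_vdevs is a hypothetical helper, not a ZFS command:

```shell
# Print "name state" for vdev rows that are not healthy, reading `zpool status` text on stdin.
faulted_vdevs() {
    # NF >= 5 restricts the match to rows of the config table (name, state, three counters).
    awk 'NF >= 5 && ($2 == "FAULTED" || $2 == "UNAVAIL" || $2 == "OFFLINE" || $2 == "REMOVED") {
        print $1, $2
    }'
}
```

Used as `zpool status pool1 | faulted_vdevs`, this should list the gptid/...eli entry that needs attention, which is the name the replacement procedure works with on an encrypted pool.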
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Should I OFFLINE the drive and then replace it with itself?
Yeah, you can try that.

do I have to follow the encrypted drive replacement procedure even if it is the same drive?
If you're comfortable with the CLI, you can just offline the partition (da3p2 IIRC) and then use it as the replacement. This avoids having to generate new keys for all the disks.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
If you're comfortable with the CLI, you can just offline the partition (da3p2 IIRC) and then use it as the replacement. This avoids having to generate new keys for all the disks.
I'm a novice but I'm not afraid to try.

zpool offline pool1 /dev/da3p2
zpool replace pool1 /dev/da3p2 /dev/da3p2
zpool online pool1 /dev/da3p2


Are these the commands I need to run? Do I do everything in the CLI or do I have to do some steps in the GUI?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Actually, now that I look at the error message more closely, I'm not sure GELI is too happy with the current state of the drive. It's best to replace it as per the manual and generate new keys for the disks. Be sure to back them up carefully.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
Actually, now that I look at the error message more closely, I'm not sure GELI is too happy with the current state of the drive. It's best to replace it as per the manual and generate new keys for the disks. Be sure to back them up carefully.
Ok, I'll do that and report back. I'll try rebooting first to see if that changes anything. Just for future reference, were those the commands I was supposed to run as per the first suggestion?

Any ideas on the underlying cause? FW issues? Cable problems? FreeNAS 11.x?

Thank you so much for helping. I'm running on a deadline and this is a real setback.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Just for future reference, were those the commands I was supposed to run as per the first suggestion?
The online isn't necessary. I'd have to double check the replace, since the man page isn't very clear.

Any ideas on the underlying cause?
This seems to be firmly an internal drive issue. It could be bad power, though.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
So, forgot to try rebooting first, duh! o_O

I went through the replacement procedure, then ran a scrub and the short and long SMART tests on all drives. Everything seems OK for the moment, but I'm at a loss as to how to proceed. Did you have a chance to look at the links in post #11 to see whether they could be related?

All SMART stats are looking good, except for the seek error rate on da1. Is that anything to worry about?
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)															
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org														
																																 
=== START OF INFORMATION SECTION ===																							 
Model Family:	 Toshiba 3.5" DT01ACA... Desktop HDD																			 
Device Model:	 TOSHIBA DT01ACA300																							 
Serial Number:	66UAJN1AS																										
LU WWN Device Id: 5 000039 fe3d2e1b4																							 
Firmware Version: MX6OABB0																										
User Capacity:	3,000,592,982,016 bytes [3.00 TB]																				
Sector Sizes:	 512 bytes logical, 4096 bytes physical																		 
Rotation Rate:	7200 rpm																										
Form Factor:	  3.5 inches																									 
Device is:		In smartctl database [for details use: -P show]																 
ATA Version is:   ATA8-ACS T13/1699-D revision 4																				 
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)																		 
Local Time is:	Sun Mar 25 14:38:55 2018 CEST																					
SMART support is: Available - device has SMART capability.																		
SMART support is: Enabled																										 
																																 
=== START OF READ SMART DATA SECTION ===																						 
SMART overall-health self-assessment test result: PASSED																		 
																																 
General SMART Values:																											 
Offline data collection status:  (0x82) Offline data collection activity														 
										was completed without error.															 
										Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (   0) The previous self-test routine completed												 
										without error or no self-test has ever													
										been run.																				 
Total time to complete Offline																									
data collection:				(22508) seconds.																				 
Offline data collection																											
capabilities:					(0x5b) SMART execute Offline immediate.														 
										Auto Offline data collection on/off support.											 
										Suspend Offline collection upon new														
										command.																				 
										Offline surface scan supported.															
										Self-test supported.																	 
										No Conveyance Self-test supported.														
										Selective Self-test supported.															
SMART capabilities:			(0x0003) Saves SMART data before entering														 
										power-saving mode.																		
										Supports SMART auto save timer.															
Error logging capability:		(0x01) Error logging supported.																 
										General Purpose Logging supported.														
Short self-test routine																											
recommended polling time:		(   1) minutes.																				 
Extended self-test routine								
recommended polling time:		( 376) minutes.																				 
SCT capabilities:			  (0x003d) SCT Status supported.																	 
										SCT Error Recovery Control supported.													 
										SCT Feature Control supported.															
										SCT Data Table supported.																 
																																 
SMART Attributes Data Structure revision number: 16																				
Vendor Specific SMART Attributes with Thresholds:																				 
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
  3 Spin_Up_Time            0x0007   140   140   024    Pre-fail  Always       -       391 (Average 425)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       86
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   099   099   067    Pre-fail  Always       -       65536
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       514
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       86
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       134
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       134
194 Temperature_Celsius     0x0002   193   193   000    Old_age   Always       -       31 (Min/Max 17/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
																																 
SMART Error Log Version: 1																										
No Errors Logged																												 
																																 
SMART Self-test log structure revision number 1																					
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       512         -
# 2  Short offline       Completed without error       00%       506         -
# 3  Extended offline    Completed without error       00%       239         -
# 4  Short offline       Completed without error       00%       233         -
# 5  Extended offline    Completed without error       00%        92         -
# 6  Short offline       Completed without error       00%        86         -
# 7  Short offline       Completed without error       00%        39         -
																																 
SMART Selective self-test log data structure revision number 1																	
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																					 
	1		0		0  Not_testing																							 
	2		0		0  Not_testing																							 
	3		0		0  Not_testing																							 
	4		0		0  Not_testing																							 
	5		0		0  Not_testing																							 
Selective self-test flags (0x0):																								 
  After scanning selected spans, do NOT read-scan remainder of disk.


EDIT: After a couple of hundred GBs of writes and various SMART tests and scrubs, the drive is now showing 0 errors.
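
To keep an eye on a single attribute such as Seek_Error_Rate over time, the attribute table can be filtered down to one row. A minimal sketch, assuming the column order shown in the report above (ID, name, flag, VALUE, WORST, THRESH, ..., RAW_VALUE); smart_attr is a hypothetical helper name:

```shell
# Print ID, name, normalized value, threshold and raw value for one named attribute,
# reading `smartctl -A` style text on stdin.
smart_attr() {
    awk -v name="$1" '$2 == name && $3 ~ /^0x/ {
        # Adding 0 strips the leading zeros from the normalized columns (099 -> 99).
        print $1, $2, "value=" ($4 + 0), "thresh=" ($6 + 0), "raw=" $10
    }'
}
```

For example, `smartctl -A /dev/da1 | smart_attr Seek_Error_Rate` would show the normalized value next to its threshold; in the report above that is 99 against a threshold of 67.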
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
Doesn't sound good. And it correlates with the number of read errors...
I checked the other drives and all of them show the same value:
Code:
0x03  0x030  4			   6  ---  Number of Mechanical Start Failures

I guess it's from the rig build and setup phase and is unrelated.
 

Touche

Explorer
Joined
Nov 26, 2016
Messages
55
This seems to be firmly an internal drive issue. It could be bad power, though.

So I nuked the pool, replugged all the data and power cables, did a fresh install of 11.1-U4, restored the config, created a new pool, sorted everything out, ran tests, scrubs and data hash checks, played with the data... and finally moved the server to production. Everything was fine until now, when the same error cropped up again, this time on da1.

Code:
Mar 27 16:37:12 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:37:39 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:39:28 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:43:19 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:43:52 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:44:21 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 16:56:17 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 17:02:02 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 17:04:02 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 20:08:28 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 20:08:44 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 27 20:09:39 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 00:00:00 FreeNAS syslog-ng[12365]: Configuration reload request received, reloading configuration;
Mar 28 07:58:22 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 08:57:08 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:00:37 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:43:18 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:49:43 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:50:28 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:54:46 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 09:54:57 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 11:11:42 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 11:13:06 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 11:29:55 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 11:32:33 FreeNAS smbd: dnssd_clientstub ConnectToServer: connect()-> No of tries: 1
Mar 28 15:13:31 FreeNAS upsd[10072]: mainloop: Interrupted system call
Mar 28 15:13:31 FreeNAS root: /usr/local/etc/rc.d/nut: WARNING: $nut_upsshut is not set properly - see rc.conf(5).
Mar 28 15:13:31 FreeNAS upsmon[10096]: upsmon parent: read
Mar 28 15:14:30 FreeNAS smartd[13168]: Device: /dev/da1 [SAT], failed to read SMART Attribute Data
Mar 28 15:14:30 FreeNAS	(da1:mpr0:0:1:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 560 Aborting command 0xfffffe0000ef1500
Mar 28 15:14:30 FreeNAS mpr0: Sending reset from mprsas_send_abort for target ID 1
Mar 28 15:14:30 FreeNAS	(da1:mpr0:0:1:0): WRITE(10). CDB: 2a 00 0c 62 7a 78 00 00 08 00 length 4096 SMID 1021 terminated ioc 804b loginfo 31130000 scsi 0 state c xfer 0
Mar 28 15:14:30 FreeNAS	(pass1:mpr0:0:1:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 551 terminated ioc 804b loginfo 31130000 scsi 0 state c xfer 0
Mar 28 15:14:30 FreeNAS mpr0: Unfreezing devq for target ID 1
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): WRITE(10). CDB: 2a 00 0c 62 7a 78 00 00 08 00
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): CAM status: CCB request completed with an error
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): Retrying command
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): CAM status: Command timeout
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): Retrying command
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): CAM status: SCSI Status Error
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): SCSI status: Check Condition
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): Error 6, Retries exhausted
Mar 28 15:14:30 FreeNAS (da1:mpr0:0:1:0): Invalidating pack
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[WRITE(offset=104236056576, length=40960)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=270336, length=8192)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=2998444761088, length=8192)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=2998445023232, length=8192)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_write_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[WRITE(offset=104234668032, length=57344)]
Mar 28 15:14:30 FreeNAS ZFS: vdev state changed, pool_guid=1367972315423497889 vdev_guid=6570850287051688413
Mar 28 15:14:30 FreeNAS ZFS: vdev state changed, pool_guid=1367972315423497889 vdev_guid=6570850287051688413
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=270336, length=8192)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=2998444761088, length=8192)]
Mar 28 15:14:30 FreeNAS GEOM_ELI: g_eli_read_done() failed (error=6) gptid/77968804-31c0-11e8-80b9-0cc47a868d02.eli[READ(offset=2998445023232, length=8192)]

Intel i3-6100
Samsung 16GB DDR4 2133 ECC X8 2R ( M391A2K43BB1-CPB)
Supermicro X11SSL-CF
Supermicro CBL-SAST-0699 SAS3 Cable SFF-8643 mini SAS HD to 4 x SATA
Data: 6x Toshiba DT01ACA300
Boot: Samsung 830 SSD
Corsair RM750x

Help!
 