zpool degraded after zfs set sync=always

Status
Not open for further replies.

Scampicfx

Contributor
Joined
Jul 4, 2016
Messages
125
Dear community,

this is how my zpool looks right now:

[Screenshot: Unbenannt.JPG — zpool status in the FreeNAS GUI]

What this volume does:
- This volume hosts two zvols for ESXi
- ESXi uses iSCSI to mount both zvols
- The zvols are san/zvol1 and san/zvol2

What I did:
- There was heavy usage on this zpool due to lots of VM activity when I entered the following commands:
Code:
zfs set sync=always san/zvol1
zfs set sync=always san/zvol2


I monitored the VMs closely. I couldn't notice any loss in write speed (or maybe it was only tiny).

Therefore, I ran an additional command:
Code:
zfs set sync=always san

(I don't know if this was a wise decision :/)
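For what it's worth, `sync` is an inheritable property, so setting it on the pool's root dataset propagates to all children that don't override it locally. A sketch of how one might verify what actually took effect (dataset names taken from this thread; output will differ per system):

```shell
# Show the effective sync setting for every dataset in the pool.
# SOURCE column: "local" = set directly on that dataset,
# "inherited from san" = picked up from the pool-level setting.
zfs get -r -o name,property,value,source sync san

# To undo the pool-wide change later, the property can be reverted:
# zfs inherit sync san    # back to the default ("standard")
```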

All commands were issued within about 5 minutes total. Shortly after the last command, FreeNAS began showing lots and lots of tracebacks on the screen (connected via IPMI). The errors flooded my monitor for about 2-3 minutes until FreeNAS initiated a hard reset by itself.

Thereafter, the entire vdev mirror-1 was degraded.

Since there was still iSCSI activity, it only took a few moments after the reboot before FreeNAS started another traceback flood on my monitor.

After the second reboot, the entire zpool was degraded.

What's the situation now?
- ESXi is still running. A few VMs have corrupted file systems.
When typing the command
Code:
zpool status -v

I receive the following output:

Code:
  pool: san
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 17h41m with 0 errors on Sun Oct  8 17:41:56 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        san                                             DEGRADED     0     0 8.17K
          mirror-0                                      DEGRADED     0     0 2.69K
            gptid/f1b8a859-4c84-11e7-b22a-0007433aed30  DEGRADED     0     0 2.69K  too many errors
            gptid/f23bcfe7-4c84-11e7-b22a-0007433aed30  DEGRADED     0     0 2.69K  too many errors
          mirror-1                                      DEGRADED     0     0 13.7K
            gptid/76c630fa-9d43-11e7-b75c-0007433aed30  DEGRADED     0     0 13.7K  too many errors
            gptid/77562f36-9d43-11e7-b75c-0007433aed30  DEGRADED     0     0 13.7K  too many errors

errors: Permanent errors have been detected in the following files:

        san/tractorunit-data:<0x1>
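For reference, the standard follow-up for permanent errors like this (per the linked ZFS-8000-8A article) looks roughly like the sketch below. This is generic ZFS administration, not advice specific to this pool, and it only makes sense once the underlying hardware has been ruled out:

```shell
# After restoring or discarding the affected data, scrub the pool
zpool scrub san

# Once the scrub completes without new errors, reset the error counters
zpool clear san

# Re-check: the permanent-error list should be gone if the corruption was cleared
zpool status -v san
```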



My assumption
My first impression is that it was a bad idea to run set sync=always during heavy load.
However, I had a chat with a colleague who has lots of experience with FreeNAS.
He told me that merely running this command should never result in a corrupted zpool. He said the only thing that might happen is reduced performance, but definitely no data corruption.
He pointed towards the SAS HBA instead. His assumption is that the two mirrored vdevs were no longer being written in sync, which led to this degradation.
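If the HBA or the disks behind it are suspected, one way to narrow it down is to look at each disk's own SMART counters; errors confined to the controller path usually don't show up there. A sketch (the /dev/daX names are placeholders; they would need to be matched to the gptids above via `glabel status`):

```shell
# Check SMART health and error-related counters on each member disk.
# da0..da3 are placeholder device names for the four pool members.
for disk in da0 da1 da2 da3; do
    echo "=== /dev/$disk ==="
    smartctl -a /dev/$disk | egrep -i 'result|reallocated|pending|crc'
done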

I would like to ask whether this assumption is actually plausible.
Mainboard: Supermicro X10SRH-CLN4F
SAS HBA: Onboard Broadcom 3008
Firmware of SAS HBA: well... I remember I flashed the firmware to IT mode, but I can't find out which command shows me the firmware version. I think the firmware version was one number below the driver version, e.g. driver 18, firmware 17, but I don't remember the exact versions...
FreeNAS version: FreeNAS-9.10.2-U4 (27ae72978)

EDIT: I just found the command sas2flash -listall... So, I used it!

Code:
[root@storageunit ~]# sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 16.00.00.00 (2013.03.01)
Copyright (c) 2008-2013 LSI Corporation. All rights reserved

        No LSI SAS adapters found! Limited Command Set Available!
        ERROR: Command Not allowed without an adapter!
        ERROR: Couldn't Create Command -listall
        Exiting Program.
[root@storageunit ~]#


Should I be worried about this result?



What else did I do?
- A few weeks ago, I added vdev mirror-1 to this zpool. Originally, this zpool consisted only of mirror-0. So now it is (it was? ;)) a striped-mirror zpool.
- I added mirror-1 while the zpool was in use by ESXi (iSCSI).
- After adding this mirror, the following message appeared on my screen:

Code:
May 22 18:53:45 freenas savecore: error reading last dump header at offset 17179865088 in /dev/dumpdev: Invalid argument

(I have to admit that the number 17179865088 could have been a different one; I received this error every time I did something with the volume manager)
I filed a bug report: https://bugs.freenas.org/issues/24099

In the bug report, these errors were classified as "Filter out useless messages", so I wasn't worried about them.


Questions:
- Is it possible that this error was some kind of silent error which led to this data corruption?
- Is there anything wrong with the SAS HBA?
- How is it possible that setting sync=always leads to the degradation of an entire zpool?
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
EDIT: I just found the command sas2flash -listall... So, I used it!
I believe the Broadcom 3008 is a SAS3 HBA, not SAS2. You'd need sas3flash instead.
 

Scampicfx

Contributor
Joined
Jul 4, 2016
Messages
125
Ahh.. thank you!

Code:
[root@storageunit ~]# sas3flash -listall
Avago Technologies SAS3 Flash Utility
Version 10.00.00.01 (2015.06.18)
Copyright 2008-2015 Avago Technologies. All rights reserved.

        Adapter Selected is a Avago SAS: SAS3008(C0)

Num   Ctlr          FW Ver        NVDATA        x86-BIOS      PCI Addr
----------------------------------------------------------------------------

0     SAS3008(C0)   13.00.00.00   0b.02.30.26   08.31.00.00   00:01:00:00

        Finished Processing Commands Successfully.
        Exiting SAS3Flash.
[root@storageunit ~]#
							


Well... firmware version 13 should be correct for FreeNAS-9.10.2-U4, right?
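As a cross-check, on FreeBSD/FreeNAS the SAS3008 is driven by mpr(4) rather than mps(4), and the driver exposes both its own version and the controller firmware version, so the driver/firmware pairing can be inspected without sas3flash. A sketch (the sysctl node index `0` is an assumption; it may differ on other systems):

```shell
# Driver and firmware version as seen by the mpr driver
sysctl dev.mpr.0.driver_version
sysctl dev.mpr.0.firmware_version

# The boot log also prints both versions:
dmesg | grep -i mpr
```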
 
Last edited: