Zvol missing all data when reattached to ESXi

Status
Not open for further replies.

Dleece

Cadet
Joined
Nov 10, 2018
Messages
3
Hello folks,

Is there any way to look at the contents of a ZFS volume from within FreeNAS? If a zvol can be mounted from within FreeNAS, could someone post the syntax? The usual Linux "mount -t ..." didn't seem to work, and I have been through the zfs man pages a number of times without figuring it out.

I was hoping to mount a certain zvol and see what data, if any, is still there. I'm fearing the worst because of an outage.

Environment context:

I have a small FreeNAS setup for my home network: one RAID set hosting 1 zpool, with 4 zvols within that zpool shared out to an ESXi server over iSCSI. The ESXi and FreeNAS iSCSI network runs through a separate switch and NICs, with jumbo frames enabled on the HP switch, VMware and FreeNAS.

I am not 100% sure of the root cause, but a few days back ESXi lost all track of the storage; the VMs are spread out over the 4 zvols. zfs list showed all 4 zvols, but restarting iSCSI on FreeNAS and rescanning the iSCSI HBA on VMware did not get the volumes reattached.

I eventually removed the targets and recreated them, then a rescan allowed mounting of the zvols. Three of the 4 zvols had data and were usable -- thank you zfs gods! -- but one zvol mounted and is completely empty.

The zpool looks to be healthy:

Code:
[root@freenas] ~# zpool status
  pool: fn545red
 state: ONLINE
  scan: scrub repaired 0 in 3h33m with 0 errors on Sun Nov 11 03:33:26 2018
config:

	NAME                                            STATE     READ WRITE CKSUM
	fn545red                                        ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    gptid/d280adcb-c637-11e7-967d-0008a108f357  ONLINE       0     0     0
	    gptid/d3288b3d-c637-11e7-967d-0008a108f357  ONLINE       0     0     0
	    gptid/d3d9c19e-c637-11e7-967d-0008a108f357  ONLINE       0     0     0

errors: No known data errors


I was also able to snapshot and clone each of the zvols; 3 out of 4 clones and original zvols mounted fine and contained data, so I think FreeNAS is working correctly. The "zfs get all" output for each of the zvols looked the same to me, but I don't claim to be any sort of ZFS expert.

I have included it below; if anyone can spot an issue, that would be great.

Thanks in advance. I'm liking FreeNAS and ZFS, and hoping there is a trick to get the missing data back.

Code:
[root@freenas] ~# zfs get all fn545red/fnmisc
NAME             PROPERTY              VALUE                  SOURCE
fn545red/fnmisc  type                  volume                 -
fn545red/fnmisc  creation              Sat Nov 11 10:01 2017  -
fn545red/fnmisc  used                  961G                   -
fn545red/fnmisc  available             2.22T                  -
fn545red/fnmisc  referenced            250G                   -
fn545red/fnmisc  compressratio         1.14x                  -
fn545red/fnmisc  reservation           none                   default
fn545red/fnmisc  volsize               700G                   local
fn545red/fnmisc  volblocksize          16K                    -
fn545red/fnmisc  checksum              on                     default
fn545red/fnmisc  compression           lz4                    inherited from fn545red
fn545red/fnmisc  readonly              off                    default
fn545red/fnmisc  copies                1                      default
fn545red/fnmisc  refreservation        711G                   local
fn545red/fnmisc  primarycache          all                    default
fn545red/fnmisc  secondarycache        all                    default
fn545red/fnmisc  usedbysnapshots       176K                   -
fn545red/fnmisc  usedbydataset         250G                   -
fn545red/fnmisc  usedbychildren        0                      -
fn545red/fnmisc  usedbyrefreservation  711G                   -
fn545red/fnmisc  logbias               latency                default
fn545red/fnmisc  dedup                 off                    default
fn545red/fnmisc  mlslabel                                     -
fn545red/fnmisc  sync                  standard               default
fn545red/fnmisc  refcompressratio      1.14x                  -
fn545red/fnmisc  written               176K                   -
fn545red/fnmisc  logicalused           269G                   -
fn545red/fnmisc  logicalreferenced     269G                   -
fn545red/fnmisc  volmode               default                default
fn545red/fnmisc  snapshot_limit        none                   default
fn545red/fnmisc  snapshot_count        none                   default
fn545red/fnmisc  redundant_metadata    all                    default
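
For what it's worth, the space figures in this listing can be cross-checked, since ZFS reports `used` as the sum of the four `usedby*` properties. Below is a minimal sketch of that arithmetic using only values copied from the output above. The helper name `to_kib` is made up for illustration and is not part of any ZFS tooling.

```shell
#!/bin/sh
# Hypothetical helper: convert a ZFS-style size (176K, 250G, or a bare
# byte count such as 0) to whole KiB.
to_kib() {
    echo "$1" | awk '
        /K$/ { sub(/K$/, ""); printf "%.0f\n", $0; next }
        /M$/ { sub(/M$/, ""); printf "%.0f\n", $0 * 1024; next }
        /G$/ { sub(/G$/, ""); printf "%.0f\n", $0 * 1024 * 1024; next }
             { printf "%.0f\n", $0 / 1024 }'
}

# Values copied from the "zfs get all fn545red/fnmisc" listing.
dataset=$(to_kib 250G)     # usedbydataset
snaps=$(to_kib 176K)       # usedbysnapshots
children=$(to_kib 0)       # usedbychildren
refres=$(to_kib 711G)      # usedbyrefreservation

total_kib=$((dataset + snaps + children + refres))
echo "sum of usedby*: $((total_kib / 1024 / 1024))G"
```

The total comes out at 961G, matching `used`. Note that `referenced` is still 250G, so if fnmisc is the zvol that looks empty from ESXi, ZFS itself is still holding about 250G of blocks for it, which would point toward how the LUN is presented rather than toward lost data.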

hardware specs:
Build FreeNAS-9.10-RELEASE (2def9c8)
Platform Intel(R) Pentium(R) CPU G4400 @ 3.30GHz
Memory 12130MB
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Did you mess with the extents on FreeNAS? If so, the serial number of that "device" would have changed, causing ESXi to detect the VMFS file system as a VMFS snapshot. Either the serial number would need to be reverted or the VMFS volume would need to be resignatured.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
You also need to do some root cause analysis for the LUNs dropping.
 

Dleece

Cadet
Joined
Nov 10, 2018
Messages
3
Thanks for the response, and good question on the extents. I don't believe I made any changes to them that changed the signatures, because ESXi recognized the devices right away, even the disk that is empty. I guess it is backup-restore time, and more regular snapshots in the future, if there is no simple way to mount the zvol and poke around.

I suspect the root cause was a network issue; it looks like a cable may have come slightly unplugged. There was a steady stream of connection drop-out messages right about the time the system failed.
Code:
no ping reply (NOP-Out) after 5 seconds; dropping connection
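
To get a feel for how many of these drops were logged, the matching lines can simply be counted. This is a minimal sketch: the function name `count_nop_drops` is made up here, the message text is the one quoted above, and the log location in the usage comment is an assumption about this FreeNAS build rather than something confirmed in the thread.

```shell
#!/bin/sh
# Count iSCSI connection drops in log text read from stdin.
# -F treats the message as a fixed string, so the parentheses are literal.
count_nop_drops() {
    grep -cF 'no ping reply (NOP-Out) after 5 seconds; dropping connection'
}

# Usage (assumption: the iSCSI target logs to /var/log/messages):
#   count_nop_drops < /var/log/messages
```

A large count clustered around the time of the outage would fit the loose-cable theory; a handful spread over weeks would suggest looking elsewhere.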
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
"if there is no simple way to mount the zvol and poke around"
A zvol shows up like any other block device on your system. It's listed under /dev/zvol/<pool_name>/, and you will also see the snapshots listed there. The issue you will have is that it's formatted as VMFS and not "zfs", ext2/3, FAT, or any other file system that FreeBSD can read.
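
Even though FreeBSD cannot read VMFS, a read-only peek at the raw device can at least show whether a partition table is still there. The sketch below checks for the GPT signature "EFI PART" at LBA 1. It assumes 512-byte logical sectors and that the datastore was created with a GPT layout (neither is stated in the thread), and the function name `has_gpt_header` is made up for illustration. The pool and zvol names are the ones posted earlier.

```shell
#!/bin/sh
# Return success if the 512-byte sector at LBA 1 begins with the GPT
# signature "EFI PART" (hex 45 46 49 20 50 41 52 54). Read-only; works
# on a device node or on a plain file.
has_gpt_header() {
    sig=$(dd if="$1" bs=512 skip=1 count=1 2>/dev/null \
        | od -An -tx1 -N8 | tr -d ' \t\n')
    [ "$sig" = "4546492050415254" ]
}

# Device path built from the pool and zvol names shown in the thread.
dev=/dev/zvol/fn545red/fnmisc
if [ -e "$dev" ]; then
    ls -l /dev/zvol/fn545red/
    if has_gpt_header "$dev"; then
        echo "$dev: GPT header found at LBA 1"
    else
        echo "$dev: no GPT header at LBA 1"
    fi
fi
```

A missing header does not prove the data is gone (an older MBR layout or a different sector size would also fail this check), but a present one suggests the zvol was not wiped.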
"I don't believe I made any changes to them that changed the signatures"
Only ESXi would change the signatures (outside of serious other issues). Any changes you made would only invalidate the signature for the LUN.
From your ESXi host's shell, run esxcli storage vmfs snapshot list. If the volume needs to be resignatured, this is where it will be listed.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
"I suspect the root cause was a network issue, looks like a cable may have come unplugged slightly. A steady stream of connection drop out messages right about the time the system failed."
With something like this, you would also expect to see a high number of frame errors or send and receive errors on the switch as well as on the host's network card.
 

Dleece

Cadet
Joined
Nov 10, 2018
Messages
3
Thanks again to kdragon75 for the direct responses, and to others who have posted on similar problems before, which also pointed me to what I needed to check. I did manage to track down most of the problems.

The "esxcli storage vmfs snapshot list" command identified problems with two LUNs being unmountable due to duplicate extents. I think it is odd that ESXi only offers you the choice of formatting the disk, and that you can't tell from the GUI why it can't mount the iSCSI LUN.

I had a clone and the original zvol shared out on the same iSCSI target under different names, and that confused ESXi. I disabled one of the extents, rescanned the iSCSI HBA, and then had the choice to retain the existing signature.

Code:
~ #  esxcfg-volume -l
VMFS UUID/label: 5a0603ac-bc7a3802-d258-00188b4eec33/FNFPSLun2
Can mount: No (duplicate extents found)
Can resignature: No (duplicate extents found)
Extent name: naa.6589cfc000000238507f7e464bd044a9:1	 range: 0 - 511743 (MB)
Extent name: naa.6589cfc0000005415819b7ca2cd53a3a:1	 range: 0 - 511743 (MB)

VMFS UUID/label: 5a061856-47e786fc-7b0c-00188b4eec33/FNFPSLun1
Can mount: No (duplicate extents found)
Can resignature: No (duplicate extents found)
Extent name: naa.6589cfc00000022324ca789de2d46987:1	 range: 0 - 491263 (MB)
Extent name: naa.6589cfc000000760a889d6342275c704:1	 range: 0 - 491263 (MB)
~ #
~ #  ... Removed the zvol clone from Freenas ISCSI target & rescanned ...
~ #
~ #  esxcfg-volume -l
VMFS UUID/label: 5a061856-47e786fc-7b0c-00188b4eec33/FNFPSLun1
Can mount: Yes
Can resignature: Yes
Extent name: naa.6589cfc00000022324ca789de2d46987:1	 range: 0 - 491263 (MB)

VMFS UUID/label: 5a0603ac-bc7a3802-d258-00188b4eec33/FNFPSLun2
Can mount: Yes
Can resignature: Yes
Extent name: naa.6589cfc0000005415819b7ca2cd53a3a:1	 range: 0 - 511743 (MB)
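
For anyone comparing listings like the two above, a small filter can boil each VMFS copy down to one line. This is only a sketch: the function name `summarize_vmfs_copies` is made up, it assumes the output layout shown in this post, and it assumes labels contain no spaces. It reads saved `esxcfg-volume -l` output on stdin, so it can be run on any machine with awk.

```shell
#!/bin/sh
# Summarize `esxcfg-volume -l` output read from stdin: one line per VMFS
# copy with its label, number of extents, and the "Can mount" answer.
summarize_vmfs_copies() {
    awk '
        function emit() {
            if (label != "")
                printf "%s extents=%d mount=%s\n", label, extents, mount
        }
        /^VMFS UUID\/label:/ {
            emit()
            n = split($3, parts, "/")   # field 3 is <uuid>/<label>
            label = parts[n]; extents = 0; mount = "?"
        }
        /^Can mount:/   { mount = $3 }
        /^Extent name:/ { extents++ }
        END { emit() }
    '
}

# Usage, with the listing saved to a file (the file name is an example):
#   summarize_vmfs_copies < esxcfg-volume.txt
```

Run against the first listing, both labels come out with extents=2 and mount=No; after the clone was removed they show extents=1 and mount=Yes. Any label with more than one extent here is a hint that the same VMFS volume is being presented twice.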
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
"I think it is odd that ESXi only offers you the choice of formatting the disk and you can't tell from the GUI why it can't mount the iscsi LUN."
I agree, but as with most web UIs, it's only there to make things easy. Unless it's FreeNAS; then there's no built-in CLI to configure anything other than networking. That was one of the best things about FreeNAS Corral. :rolleyes:

I'm glad to see it all worked out!:)

EDIT: Don't forget to mark this as solved ;)
 