Interpreting free space outputs to fix failing snapshots


Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
My snapshots are failing, allegedly due to a lack of disk space:

Code:
Snapshot PER-WORKVOL2/PER-WORKVOL2@auto-20170329.1637-2w failed with the following error: cannot create snapshot 'PER-WORKVOL2/PER-WORKVOL2@auto-20170329.1637-2w': out of space



I am trying to understand what exactly is out of space (the dataset? the pool?) and how to fix it. I have read the documentation and many forum posts, but I still find this aspect confusing.


My pool has 10x 6 TB HUS726060AL5210 disks in a RAIDZ-2, so it should have (10-2) x 6 TB = 48 TB of usable capacity; the layout is below. The sole purpose of my FreeNAS box is iSCSI storage for my virtual machines. (Unfortunately, I have named my pool and its zvol dataset identically, which makes things confusing.)


Code:
# zpool status
  pool: PER-WORKVOL2
state: ONLINE
  scan: scrub repaired 0 in 8h9m with 0 errors on Sun Mar 19 08:09:37 2017
config:

	NAME											STATE	 READ WRITE CKSUM
	PER-WORKVOL2									ONLINE	   0	 0	 0
	  raidz2-0									  ONLINE	   0	 0	 0
		gptid/e64c2232-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e6cda207-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e34f7114-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e73694ca-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e8437afc-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e8d23cef-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e95263d8-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/e9d66038-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/ea5f6d13-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/eaefd10c-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
	logs
	  mirror-1									  ONLINE	   0	 0	 0
		gptid/ebe42886-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
		gptid/ec124b82-ef4a-11e6-8f69-0cc47ad8b846  ONLINE	   0	 0	 0
	cache
	  gptid/ebb23f0d-ef4a-11e6-8f69-0cc47ad8b846	ONLINE	   0	 0	 0
	spares
	  gptid/eb7fba38-ef4a-11e6-8f69-0cc47ad8b846	AVAIL  





How do I interpret this output? Which entry is the pool, and which is my zvol?


Code:
# zfs list
NAME															USED  AVAIL  REFER  MOUNTPOINT
PER-WORKVOL2												   39.4T   853G   201K  /mnt/PER-WORKVOL2 
PER-WORKVOL2/.system											413M   853G   400M  legacy
PER-WORKVOL2/.system/configs-eab18b758b91471d95803a91d80bfcda  7.09M   853G  7.09M  legacy
PER-WORKVOL2/.system/cores									  201K   853G   201K  legacy
PER-WORKVOL2/.system/rrd-eab18b758b91471d95803a91d80bfcda	   201K   853G   201K  legacy
PER-WORKVOL2/.system/samba4									 631K   853G   631K  legacy
PER-WORKVOL2/.system/syslog-eab18b758b91471d95803a91d80bfcda   5.77M   853G  5.77M  legacy
PER-WORKVOL2/PER-WORKVOL2									  39.4T  30.9T  9.34T  -



Meanwhile, zpool iostat shows plenty of free space in the pool:

Code:
# zpool iostat
                capacity     operations    bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
PER-WORKVOL2  12.3T  42.2T     95    124  2.34M  2.49M



The “View Volumes” page in the web GUI is even more confusing:
[Attachment: Volumes.png]


Questions:
  • I assume the first line refers to the pool? If so, what does the 42.2 TiB available refer to? Is it the total capacity after subtracting RAIDZ-2 parity? If so, why 42.2 TiB? 48 TB (8x 6 TB) = 43.7 TiB. Is the missing 1.5 TiB simply ZFS overhead? Or is this minus snapshot usage? Or do I simply put it down to actual capacity being slightly less than advertised? (See the sketch after this list.)
  • What do the second and third lines refer to? All I have is a single zvol for iSCSI storage.
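For what it's worth, comparing the pool-level and dataset-level reports side by side seems to be the way to untangle this; as I understand it, zpool counts raw space including parity, while zfs counts space as datasets can actually use it. A sketch (output abbreviated; values taken from the outputs above):

Code:
# Pool-level view: raw space, parity included (hence 42.2T free)
zpool list PER-WORKVOL2
# NAME           SIZE  ALLOC   FREE  ...
# PER-WORKVOL2  54.5T  12.3T  42.2T  ...

# Dataset-level view: space as usable by datasets (the AVAIL column above)
zfs list -o name,used,avail -r PER-WORKVOL2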

The nine existing snapshots take up minimal space: only 862.4 MiB when I added up the "Used" column:

[Attachment: snapshots.png]

Questions:
  • What does the "Refer" column mean?
  • If I delete, say, the first snapshot, does it free up only the 66.5 MiB, or does it free up 7.1 TiB, or the sum of 66.5 MiB + 7.1 TiB? (See the dry-run sketch after this list.)
  • What space do snapshots consume in the "View Volumes" screenshot further above: the first, second, or third line?
  • How do I fix my failing snapshots?
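To answer the second bullet empirically: as I understand it, zfs destroy supports a dry run that reports how much space a deletion would reclaim, without destroying anything. A sketch (the snapshot name is illustrative):

Code:
# -n = dry run (nothing is destroyed), -v = print what would be reclaimed
zfs destroy -nv PER-WORKVOL2/PER-WORKVOL2@auto-20170315.1637-2w
# would destroy PER-WORKVOL2/PER-WORKVOL2@auto-20170315.1637-2w
# would reclaim 66.5M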


My hardware & version info:
  • Build: FreeNAS-9.10.2 (a476f16)
  • Supermicro MBD-X10DRI-O board
  • 8x Samsung 16GB ECC DDR4-2133 RDIMMs = 128GB RAM
  • 1x Intel Xeon E5-2620 v4, 2.1 GHz
  • 1x SAS9207-8i HBA
 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
Since my last post, used space has decreased from 39.4 TiB to 32.3 TiB (a decrease of 7.1 TiB). The only major change is that the remaining nine snapshots were deleted by the 2-week retention policy. The 7.1 TiB freed would match the snapshots' "Refer" column amount, but under Fig. 8.5.1 "Viewing Available Snapshots" in the manual (http://doc.freenas.org/9.10/storage.html#snapshots) it says:

The amount of space that a dataset consumes from its parent, as well as the amount of space that is freed if this dataset is recursively destroyed, is the greater of its space used and its reservation.


The reservation is none, and space used was minimal, as per my previous post. I assumed the 7.1 TiB in the snapshots' "Refer" column mostly referred to the same blocks that are still in use, because I haven't deleted or overwritten much data since those snapshots were taken.

  • Question: Why was 7.1 TiB freed? Is it even related to the snapshots? (A diagnostic sketch follows.)
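If it is snapshot-related, I gather the usedby* properties should show it; they break a dataset's USED figure into its components. A diagnostic sketch:

Code:
# Attribute USED space to data, snapshots, reservations and children
zfs get -r usedbydataset,usedbysnapshots,usedbyrefreservation,usedbychildren PER-WORKVOL2
# usedbysnapshots      = space pinned only by snapshots
# usedbyrefreservation = space set aside by a refreservation but not yet written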


See new usage numbers below:

[Attachment: view-volumes.png]



So I have now carefully read the documentation (thanks @dlavigne for the link).

Under “View Volumes” it says:

First line’s Used and Available entries reflect the total size of the pool, including disk parity.

If I add both numbers, I get 54.5 TiB, which roughly equals 60 TB, matching the 10x 6 TB disks in my RAIDZ-2 vdev, so this makes sense. To be clear, this number apparently excludes the spare and the disks used for L2ARC and SLOG, which makes sense because they are not available for storage. That question is now answered; thanks, dlavigne.
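The unit arithmetic checks out, e.g. with bc:

Code:
# 10 disks x 6 TB (decimal) expressed in TiB (binary, 2^40 bytes)
echo 'scale=1; 10 * 6 * 10^12 / 2^40' | bc
# 54.5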


The second represents the implicit or root dataset and its Used and Available entries indicate the amount of disk space available for storage.

So, ignoring compression, I expect the total capacity minus parity to be 8x 6 TB = 48 TB, or 43.6 TiB. But when I add Used (32.3 TiB) and Available (8.0 TiB), I get only 40.3 TiB.

Questions:
  • Where is my other 3.3 TiB? In fact, with compression, should I not expect even more?
  • Is it necessary to have this “root dataset”, or should I have deleted it and simply created a zvol directly on the pool so that the zvol becomes the second line? Is that even possible?

I assume the third line is my zvol for iSCSI storage. It says 32.3 TiB used, yet VMware is only using 9.86 TB for VMs, as shown in the screenshot below. So I guess zvols are "thick provisioned", i.e. the zvol's full capacity counts as used within the pool's available storage, correct? (See the sketch after the screenshot.) The numbers also only line up if VMware confuses TiB with TB and the total capacity shown in the VMware interface is in fact 32 TiB, which wouldn't surprise me.

[Attachment: vmware-capacity.png]
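If my thick-provisioning guess is right, the distinction at the CLI would look like the sketch below (volume names and size are illustrative; I believe the FreeNAS zvol-creation dialog exposes the same choice as a "sparse volume" checkbox):

Code:
# Default ("thick"): a refreservation equal to the volume size is set,
# so the full 32T counts as used immediately
zfs create -V 32T PER-WORKVOL2/thick-vol

# Sparse ("thin"): -s skips the refreservation; space is consumed
# only as blocks are actually written
zfs create -s -V 32T PER-WORKVOL2/sparse-vol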



I still have these questions:
  • Why are my snapshots still failing now that the minimum “Available” figure is 8 TiB?
  • How do I fix my snapshots?
 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
Can no one help?

Is my periodic snapshot task configured correctly? It did work for the first few weeks:

[Attachment: Screen Shot 2017-04-03 at 09.00.08.png]


Any advice, anyone?
 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
I have deleted and re-created the periodic snapshot task, but the problem persists.

zfs list clearly shows the pool has 7.97T available (where T means TiB; ZFS reports sizes with binary prefixes):

Code:
NAME															USED  AVAIL  REFER  MOUNTPOINT
PER-WORKVOL2												   32.3T  7.97T   201K  /mnt/PER-WORKVOL2
PER-WORKVOL2/PER-WORKVOL2									  32.3T  30.8T  9.39T  -


I have no idea where to go from here.

Disappointing that no one on the forum has any advice for me...
 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
If I select the bottom line (i.e. the zvol itself) and take a manual snapshot, it fails with "out of space".

If I select the middle line and take a manual snapshot, it succeeds if it is not recursive; when it is recursive, it also fails.
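In CLI terms, I believe these attempts are equivalent to (snapshot name illustrative):

Code:
# Middle line, non-recursive: snapshots only the root dataset -- succeeds
zfs snapshot PER-WORKVOL2@manual-test

# Middle line, recursive: -r also snapshots the zvol underneath -- fails
zfs snapshot -r PER-WORKVOL2@manual-test

# Bottom line, the zvol itself -- fails with "out of space"
zfs snapshot PER-WORKVOL2/PER-WORKVOL2@manual-test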

The log says:
Code:
 manage.py: [middleware.exceptions:37] [MiddlewareError: Snapshot could not be taken: cannot create snapshot 'PER-WORKVOL2/PER-WORKVOL2@manual-20170406': out of space
no snapshots were created


What does this middle line represent? What am I snapshotting when I take a snapshot of it?

How can a snapshot of one thing work while another doesn't? I thought snapshots simply consume general storage in the pool, as per https://docs.oracle.com/cd/E23824_01/html/821-1448/gbciq.html :
"Snapshots use no separate backing store. Snapshots consume disk space directly from the same storage pool as the file system or volume from which they were created."
Clearly both of these are in the same pool, and I only have one pool in any case.





Any advice anyone please???
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
So you have a quota set?

 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
Thank you, SweetAndLow. I do not have quotas set, but you pointed me to the answer.

To confirm, I ran the command below:

Code:
zfs list -o name,quota,refquota,reservation,refreservation

NAME														   QUOTA  REFQUOTA  RESERV  REFRESERV
PER-WORKVOL2													none	  none	none	   none
PER-WORKVOL2/PER-WORKVOL2										  -		 -	none	  32.3T


Nothing in the QUOTA column, but 32.3T in the REFRESERV column.

According to the zfs man page (https://www.freebsd.org/cgi/man.cgi?query=zfs):
"If refreservation is set, a snapshot is only allowed if there is enough free pool space outside of this reservation to accommodate the current number of "referenced" bytes in the dataset."

Code:
# zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
PER-WORKVOL2               32.3T  7.97T   201K  /mnt/PER-WORKVOL2
PER-WORKVOL2/PER-WORKVOL2  32.3T  30.8T  9.39T  -

Free pool space (7.97 TiB) is less than the referenced bytes in the dataset (9.39 TiB), so snapshots are refused.

Can I set refreservation to none? How? Like below, or is this adjustable in the GUI?
Code:
zfs set refreservation=none PER-WORKVOL2/PER-WORKVOL2


What are the risks of setting this to none? I understand I need to monitor space, and that if I run out I'm screwed. I assume "out of space" then means the sum of snapshot usage and datastore usage approaching the pool's available space after parity and overheads. In the absence of a refreservation, I assume the "USED" column will equal the "REFER" column in the above zfs list output, correct?
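For the monitoring part, I'm thinking of a cron check along these lines (a sketch; the 90% threshold is arbitrary):

Code:
#!/bin/sh
# Warn by mail when the pool passes a fullness threshold.
POOL=PER-WORKVOL2
THRESHOLD=90
CAP=$(zpool list -H -o capacity "$POOL" | tr -d '%')
if [ "$CAP" -ge "$THRESHOLD" ]; then
    echo "WARNING: $POOL is ${CAP}% full" | mail -s "ZFS space alert" root
fi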
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
You can remove the reservation settings if you want. No reason to have them, really.

 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
Hi Yan Sh, yes, it's working well now. The zfs set refreservation=none command did the trick.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Hi Yan Sh, yes, it's working well now. The zfs set refreservation=none command did the trick.
Don't use the CLI to mess with things!

 

Superman

Dabbler
Joined
Mar 29, 2017
Messages
14
I never set the refreservation in the first place and I can't find any way in the GUI to unset it.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I never set the refreservation in the first place and I can't find any way in the GUI to unset it.
It's in the GUI. Go to the STORAGE tab, highlight the dataset/pool in question, and at the bottom click the properties wrench. You'll see a "reserved space for this dataset" field and/or a "reserved space for this dataset and all children" field. That's the refreservation.
 