Pool reports 98% used - but I think it's not

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Version: TrueNAS-13.0-U4

One pool, V2, shows:
Used Space: 98%

No snapshots
Dedup is off on both zfs & zpool
Compression is on
No SMB shares

Code:
root@kye-nas01:~ # zpool list -v V2
NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
V2                                              54.5T  4.10T  50.4T        -         -     1%     7%  1.00x    ONLINE  /mnt
  raidz1-0                                      54.5T  4.10T  50.4T        -         -     1%  7.52%      -    ONLINE
    gptid/abaa5f8c-9391-11ed-b0a1-a0369f1f8364  18.2T      -      -        -         -      -      -      -    ONLINE
    gptid/ac2cf79e-9391-11ed-b0a1-a0369f1f8364  18.2T      -      -        -         -      -      -      -    ONLINE
    gptid/acacb8ab-9391-11ed-b0a1-a0369f1f8364  18.2T      -      -        -         -      -      -      -    ONLINE
root@kye-nas01:~ # zfs list V2
NAME   USED  AVAIL     REFER  MOUNTPOINT
V2    35.1T   781G      117K  /mnt/V2

root@kye-nas01:~ # zfs list -r -o space V2
NAME       AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
V2          781G  35.1T        0B    117K             0B      35.1T
V2/Z2      33.1T  35.0T        0B   2.73T          32.3T         0B
V2/sbd03    782G  1.02G        0B   95.9K          1.02G         0B
V2/test02   882G   102G        0B   18.9M           102G         0B

root@kye-nas01:~ # zfs list -t snapshot V2
no datasets available

root@kye-nas01:~ # zpool status -D V2
  pool: V2
 state: ONLINE
  scan: scrub repaired 0B in 04:43:33 with 0 errors on Sun Apr 30 04:43:34 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        V2                                              ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/abaa5f8c-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0
            gptid/ac2cf79e-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0
            gptid/acacb8ab-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0

errors: No known data errors

 dedup: no DDT entries

root@kye-nas01:~ # zpool list -o listsnapshots V2
LISTSNAPS
off
root@kye-nas01:~ # zpool list -o dedupratio V2
DEDUP
1.00x

root@kye-nas01:~ # df -hT | grep V2
V2                                                   zfs        781G    118K    781G     0%    /mnt/V2

root@kye-nas01:~ # ls -l /mnt/
total 16
-rw-r--r--  1 root  wheel  5 May 28 21:40 md_size
drwxr-xr-x  7 root  wheel  7 Jan 23 08:15 V1
drwxr-xr-x  2 root  wheel  2 Jan 13 15:28 V2
root@kye-nas01:~ # ls -l /mnt/V2
total 0


--
from the SLES iSCSI initiator hosts
kye-vmh02:~ # df -hT
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/sda6      ext3   271G  164G  107G  61% /
udev           tmpfs   63G  504K   63G   1% /dev
tmpfs          tmpfs   63G   76M   63G   1% /dev/shm
/dev/dm-1      ocfs2   35T  4.1T   31T  12% /srv/backup02


----
This TrueNAS is an iSCSI storage target for my SLES hosts' VMs (virtual disks).

Version: TrueNAS-13.0-U4

Why is the ZFS pool showing 98% full on the dashboard when it's less than half full from the host server's view?

Is there a way to 'see' the space used and clear it?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
zfs list -t snapshot V2
This command is showing you that you have no snapshots of the V2 pool root dataset, but won't show you if you have any snapshots in the dataset that's holding all the space:

root@kye-nas01:~ # zfs list -r -o space V2
This command clearly shows you that all the space is taken here:
NAME       AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
V2/Z2      33.1T  35.0T        0B   2.73T          32.3T         0B

So this command is what you need:
zfs list -t snapshot V2/Z2
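If you want to rule out snapshots anywhere in the pool in one pass (a small extra check, not something the output above strictly requires), the recursive form of the same command lists snapshots at any depth:

Code:
# list every snapshot under V2, at any depth, with the space it holds
zfs list -r -t snapshot -o name,used,referenced V2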
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Why is the ZFS pool showing 98% full on the dashboard when it's less than half full from the host server's view?
Is it iSCSI to a VMware host? If so, it's likely because you didn't use thin provisioning when you created the ZVOL.
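One quick way to check that from the shell, assuming V2/Z2 is the zvol backing the extent (a guess based on the space output above, not something confirmed in this thread): a thick zvol carries a refreservation roughly equal to its volsize, while a sparse (thin) one shows none.

Code:
# thick zvol: refreservation ~= volsize; sparse zvol: refreservation = none
zfs get volsize,refreservation,usedbyrefreservation V2/Z2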
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Thanks for the info

as for snapshots

root@kye-nas01:~ # zfs list -t snapshot V2/Z2
no datasets available

So no snapshots on the pool or dataset
----
As for iSCSI, the hosts are SLES 11 SP4 (old, I know, but the CEO doesn't want to upgrade),
and the hosts mount the iSCSI LUN as thick, formatted OCFS2; the VM VDs are also thick (raw) and static in size, but there's lots of r/w from the VMs.
But the hosts show only 12% filled:
df -hT
/dev/dm-1 ocfs2 35T 4.1T 31T 12% /srv/backup02


---

ZFS is a copy-on-write system, so every time a change is made to a VM VD, ZFS does a c-o-w and updates its backend FS.
What does it do with the old section of the c-o-w data? Shouldn't it be placed back in the free space and NOT counted as used space?
- Or am I thinking wrong?
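For what it's worth, one counter you can look at (a partial check, not a full answer to the question above) is the pool's freeing property, which shows blocks ZFS has already freed but not yet returned to free space. It only covers frees ZFS knows about, not blocks the guest filesystem considers free.

Code:
# space already freed by ZFS but still being reclaimed in the background
zpool get freeing V2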

The ZFS pool % used has been going up since it was deployed (from under 50% to now 98%) with no change to the VM VDs' size.

I'm trying to figure this out before something bad happens to the pool/dataset/VM-VDs.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
What does it do with the old section of the c-o-w data? Shouldn't it be placed back in the free space and NOT counted as used space?
- Or am I thinking wrong?

COW blocks no longer in use aren't returned to free space until the iSCSI LUN is dismounted and quiesced, as it's not safe to do so when the zvol is in use.
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Thanks Samuel

So you're stating that once in a while the LUN must be dismounted & remounted so that the old blocks no longer in use are returned to free space.

The system was restarted 3 days ago (TrueNAS & hosts restarted - iSCSI LUN dismounted) and the used space still did not go down. What's with this?

When the iSCSI LUN is dismounted, does ZFS quiesce the pool automatically? Or does a command need to be run (command, please)?

---
also for the pool status

root@kye-nas01:~ # zpool get all V2
NAME  PROPERTY       VALUE                  SOURCE
V2    size           54.5T                  -
V2    capacity       7%                     -
V2    altroot        /mnt                   local
V2    health         ONLINE                 -
V2    guid           4210886821470946784    -
V2    version        -                      default
V2    bootfs         -                      default
V2    delegation     on                     default
V2    autoreplace    off                    default
V2    cachefile      /data/zfs/zpool.cache  local
V2    failmode       continue               local
V2    listsnapshots  off                    default
V2    autoexpand     on                     local
V2    dedupratio     1.00x                  -
V2    free           50.4T                  -
V2    allocated      4.10T                  -
V2    readonly       off                    -
V2    ashift         0                      default
V2    comment        -                      default
V2    expandsize     -                      -
V2    freeing        0                      -
V2    fragmentation  1%                     -
V2    leaked         0                      -
V2    multihost      off                    default
V2    checkpoint     -                      -
V2    load_guid      5522006745298701650    -
V2    autotrim       off                    default
V2    compatibility  off                    default

Does this help?
The above states the capacity is 7%.
Is this the same as 'used' on the dashboard?
What does 'capacity' mean in the above?
If so, then free should be 93% and the dashboard should say 'Used Space: 7%'.
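For reference, a likely reading of that gap (an interpretation based on the outputs earlier in the thread, not a definitive answer): zpool 'capacity' is the share of raw pool space physically allocated (4.10T of 54.5T, about 7%), while the dashboard's 98% comes from dataset-level accounting, which also counts reservations - and the USEDREFRESERV column earlier shows roughly 32.3T held by a refreservation on V2/Z2 rather than by written data. Comparing the two views side by side:

Code:
# physical allocation at the pool level
zpool list -o name,size,allocated,free,capacity V2
# dataset-level accounting, including space held by reservations
zfs list -r -o name,used,usedbyrefreservation,usedbydataset,available V2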
---
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
So, it seems the pool listing is displaying all the space as used by the thick dataset,
which started out under 50% (I think it was under 10% at the beginning) and is now 98%,
and the iSCSI device & pool must be dismounted & remounted so that the old blocks no longer in use are returned to free space.
- I'll have to wait for the next time this can be performed & then check the used space.

Thanks all - I'll report back with the results when this is accomplished.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
COW blocks no longer in use aren't returned to free space until the iSCSI LUN is dismounted and quiesced, as it's not safe to do so when the zvol is in use.

Uh, what? Cite? That makes no sense.


raidz1-0                                      ONLINE       0     0     0
  gptid/abaa5f8c-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0
  gptid/ac2cf79e-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0
  gptid/acacb8ab-9391-11ed-b0a1-a0369f1f8364  ONLINE       0     0     0

I rather suspect this has something to do with it. There's been no discussion of the zvol parameters here. Because you're using RAIDZ1 and a three-wide vdev, it's easy to get into pathological space consumption situations if you do not carefully select all your variables. You might want to refer to post #5 in this thread where I do a deep(ish) dive into the bad effects of RAIDZ combined with block storage.


The other thing to bear in mind is that RAIDZ free space computation is generally a guesstimate, not an absolute, because the system doesn't know how to account for the variable data size vs parity issue. If you have a single data sector, RAIDZ1 will consume one sector for parity. If you have two data sectors, RAIDZ1 will consume one sector for parity plus one pad. If you have three, then one parity. Etc. So you really can't calculate space consumption accurately unless you understand the types of data you're going to store.
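As a rough illustration of that padding effect on this exact layout (a sketch: it assumes ashift=12, i.e. 4K sectors, and an 8K volblocksize, neither of which is confirmed in this thread):

Code:
# hypothetical arithmetic for 3-wide RAIDZ1, ashift=12, 8K volblocksize:
#   8K block = 2 data sectors + 1 parity = 3 sectors,
#   rounded up to a multiple of (parity+1) = 2  ->  4 sectors = 16K on disk,
#   i.e. roughly 2x the logical size before compression.
# check what the zvol is actually configured with (V2/Z2 assumed to be the zvol):
zfs get volblocksize,used,logicalused,compressratio V2/Z2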
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Thanks for the explanation, Mr. Grinch.

It seems that even though the dashboard used space states 98%, the actual space used from the host POV is 12%, so there is room to use.

Had a maintenance window, restarted the NAS, and the pool is still at 98% - it seems no old blocks were given back.
The NAS was just restarted; the iSCSI LUNs were not dismounted - does this make a difference?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It seems that even though the dashboard used space states 98%, the actual space used from the host POV is 12%, so there is room to use.

The problem here is that the host might think there's only 12% space used, but that's ... skewed.

When you buy a 1TB HDD and attach it to your computer, you're used to thinking that it is 0% full when you first access it, because the filesystem you're using has not stored anything in that space, so both the filesystem and you believe it to be 0% full.

However, from another perspective, the 1TB HDD is 100% full. It has 1953125000 sectors, and all 1953125000 have been assigned LBA's. If you request sector (LBA) #10, you will get 512 bytes back. If you then write 512 bytes to sector #10, it will store that, and if you request sector #10 AGAIN, you will get your 512 bytes back. But the capacity of the drive doesn't change from those actions. You can write data to all the LBA's, half the LBA's, ten point five six seven five percent of all the LBA's, and none of this changes the full percentage of the drive.

SAN block storage is the latter model, it is not interested in the opinion of whatever is consuming the data on your host (probably a filesystem) but really only deals with storing the data. If you put a NTFS filesystem on a 1TB ZFS zvol from your host system, it will think that there is 0% used at first, but then if you put 500GB of data on it, the ZFS backend will consume space to store that. It might not be exactly 500GB due to factors such as compression and parity, but space will be consumed. If you then erase the file on the NTFS filesystem, one of two things will happen. One is that if you do not have TRIM or UNMAP support, your zvol will continue to consume the same 500GB even though your NTFS filesystem goes back down to 0% used. The data is still out there on the sectors stored in the zvol. The other thing that can happen is that if you have TRIM or UNMAP support, NTFS notifies the underlying storage device (typically an SSD but in this case ZFS) that the storage is released, and the underlying storage can be freed. In this case, the data is purged from the data stored on the zvol and the data previously stored on those sectors becomes irretrievable, returning zeroes instead.

Had a maintenance window, restarted the NAS, and the pool is still at 98% - it seems no old blocks were given back.

Right. Just like data on a HDD remains retrievable even though you might think you deleted it in your NTFS filesystem. The hard drive cannot read your intentions and does not understand the structure of NTFS, nor would the drive attempt to unilaterally zero blocks without some mechanism to tell it to do so. That's why SSD's introduced TRIM and SCSI has UNMAP. Your filesystem must signal to the underlying storage that the space is available for reclamation, otherwise how would the underlying storage possibly know that "old blocks" were being "giv[en] back"?
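To make the signalling concrete: the reclaim request has to come from the host side, for example via fstrim on the mounted filesystem. This is only a sketch - it assumes the initiator's kernel and the TrueNAS extent both pass UNMAP through, which is worth verifying before relying on it.

Code:
# on the SLES initiator: tell the filesystem to send discard/UNMAP
# for every extent it knows to be unused
fstrim -v /srv/backup02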
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Thanks for this explanation

It seems that the TrueNAS ZFS and the host OS do not communicate with each other, and therefore I cannot expect the ZFS space calculation to reflect the host OS space calcs.

But zfs 'knows' about all the real used blocks and old unused blocks - correct?
If so, is there a way to find the amount of real used blocks? And the old unused blocks? And the never-used/free blocks?
If so, is there a zfs command to turn the old unused blocks into never-used/free blocks? Thus lowering the dashboard used space?
If not, then I'll have to rely on the host OS's available space.

How does zfs know what to do with old unused blocks? Does it use them for real data blocks at some point?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
But zfs 'knows' about all the real used blocks and old unused blocks - correct?
How does zfs know what to do with old unused blocks? Does it use them for real data blocks at some point?
No, and no. Each block that the host OS has written at least once is considered "used". Each time the host OS writes an already used block a second (third, fourth, ...) time, ZFS will write a fresh one and release the old one because of the CoW nature of ZFS. But since ZFS does not know anything about blocks considered "free" by the host OS, the number of "used" blocks as seen by ZFS never decreases.

Yes, this means you essentially cannot overprovision or "thin" provision block storage.
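To make that concrete (a sketch only; the zvol name is hypothetical, and -s merely skips the up-front reservation - it does not by itself make freed space come back):

Code:
# a conventional (thick) zvol reserves its full volsize immediately:
#   zfs create -V 10T V2/thick_example
# a sparse zvol skips the reservation, but its 'used' still only grows
# unless the initiator sends TRIM/UNMAP:
zfs create -s -V 10T V2/thin_example   # hypothetical dataset name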
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
But zfs 'knows' about all the real used blocks and old unused blocks - correct?

You again conflate 'knowing' used and unused. Absent some sort of signalling between the host's filesystem and ZFS, ZFS has no way to know if a block is used or unused. This can be communicated via TRIM or UNMAP. If it is not communicated, ZFS has no way to know a block is unused. In such a model, blocks only ever get marked as used.

If so, is there a way to find the amount of real used blocks? And the old unused blocks? And the never-used/free blocks?

No. If ZFS doesn't know, about the best the host filesystem can do is write a zero-filled block, which compresses well.
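A sketch of that zero-fill workaround from the host side (hypothetical filename; note the caveats: it temporarily fills the guest filesystem to 100%, dd is expected to stop with a "no space left" error, and on a thick zvol the refreservation keeps the dataset-level 'used' pinned regardless, so this mainly helps sparse zvols or file-based extents):

Code:
# write zeroes into the guest FS's free space; with compression on at the
# ZFS end they take almost no room, and the old blocks they overwrite are freed
dd if=/dev/zero of=/srv/backup02/zerofill bs=1M
rm /srv/backup02/zerofill
sync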

If so, is there a zfs command to turn the old unused blocks into never-used/free blocks? Thus lowering the dashboard used space?

No. Because, again, unless you're notifying via TRIM or UNMAP, ZFS has no way to know a block is unused. Once you've written data to a block, that space must remain allocated. ZFS has to act like a hard drive (or an SSD) when storing block data.

If not, then I'll have to rely on the host OS's available space.

That's useless. Your actual free space is what ZFS claims it to be. If you are at 98% and you write that last 2%, ZFS will eventually die with an "out of disk" condition where it is unable to write the blocks, even if your host filesystem says you have only 10% used.

How does zfs know what to do with old unused blocks? Does it use them for real data blocks at some point?

When you overwrite a block, ZFS allocates a new block (elsewhere on your pool), writes the new data there, links up the metadata to include that new block, and then frees your old block -- unless snapshots or something like that are holding it open. That leaves these little spots of free space on the pool that can be used for more new blocks, but it also causes lots of seeking.
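If you want a rough view of how scattered those freed spots are, the pool-level counters are enough (read-only, nothing invasive):

Code:
# FRAG is ZFS's estimate of free-space fragmentation; CAP is physical allocation
zpool list -o name,capacity,fragmentation,freeing V2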
 

JohnGoutbeck

Dabbler
Joined
Nov 24, 2015
Messages
12
Hello;

On the TrueNAS I have 2 vdevs

vdev 1
- is mounted on the hosts via iSCSI and is an LVM group & disk formatted as OCFS2
- on this OCFS2 volume I can perform an fstrim command on the mounted volume (fstrim -v /srv/backup01), and the TrueNAS used space went down from 83% to 29% - great.

vdev 2
- is mounted on the hosts via an iSCSI LUN disk formatted as OCFS2 (not an LVM disk)
- on this OCFS2 volume I can perform an fstrim command on the mounted volume (fstrim -v /srv/backup02), and the TrueNAS used space DOES NOT go down from the 98% used space.

So the fstrim command does work on some disks/volumes but not on others - depending on the disk controlling app?

Comments??
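A couple of hedged checks that might narrow this down (device names are examples - substitute the actual devices behind each mount): whether the block device on the SLES side advertises discard at all, and how the dataset behind /srv/backup02 is provisioned on the TrueNAS side. If it sits on the thick zvol with the large refreservation seen earlier in the thread, the trimmed space can be released inside the zvol while the dataset's 'used' stays pinned near the reservation - one possibility, not a certainty.

Code:
# on the SLES host: 0 means the device (or the dm layer above it) is not
# passing discard/UNMAP through
cat /sys/block/dm-1/queue/discard_max_bytes
cat /sys/block/sdb/queue/discard_max_bytes   # hypothetical underlying iSCSI disk

# on TrueNAS: how is the dataset behind this extent provisioned?
zfs get volsize,refreservation,used,referenced V2/Z2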
 