"Usable capacity" is wrong. It's not slop.

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
I noticed some size discrepancies that I'd like to figure out regarding ZFS. From what I'm seeing, it's eating terabytes (tebibytes) of disk space.

I have 2 x 2TB drives. That's 1.82 TiB per drive. If I mirror them, I'd expect to get 1.82 TiB of usable capacity:
[screenshot]


I understand that there's 2GB of swap space reserved, but that's a negligible 0.002TB.
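Here's the arithmetic, as a quick Python sketch (the only assumption is that the 2GB swap comes off the raw drive size):
Code:
# TB -> TiB conversion plus the swap deduction
TB  = 10**12   # drives are sold in decimal terabytes
TiB = 2**40    # TrueNAS reports binary tebibytes
GB  = 10**9

drive = 2 * TB
print(drive / TiB)              # ~1.82 TiB per drive
print((drive - 2 * GB) / TiB)   # ~1.82 TiB -- swap barely moves the number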

We can verify that with the OpenZFS calculator (https://www.truenas.com/community/resources/openzfs-capacity-calculator.185/):

[screenshot]


It says because of slop (whatever that is), I should only get 1.76 TiB of usable capacity, with about 1.4 TiB usable if I avoid going over 80% capacity.
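For reference, here's roughly where that 1.76 TiB seems to come from: a 1/32 "slop" reservation (spa_slop_shift=5 by default). This is my own sketch; the 128 MiB floor and the 128 GiB cap in newer OpenZFS are from memory, so verify against your version's spa_get_slop_space().
Code:
MiB, GiB, TiB = 2**20, 2**30, 2**40

def slop_bytes(pool_bytes, slop_shift=5):
    # 1/32 of the pool, clamped -- the exact floor/cap are assumptions, see above
    return min(max(pool_bytes >> slop_shift, 128 * MiB), 128 * GiB)

mirror = int(1.82 * TiB)
usable = mirror - slop_bytes(mirror)
print(usable / TiB)        # ~1.76 TiB, matching the calculator
print(0.8 * usable / TiB)  # ~1.41 TiB if you stop at 80%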

But what do I actually see?

[screenshot]

1.32 TiB of usable capacity. WHAT? That's 73% of the total capacity. What's going on here?

TrueNAS will start emailing me when I get close to 80%, but I'm already down 27%! Meaning, by the time I get the 80% warning, I'm at only 58% of the raw drive capacity.

I saw an article about all the space taken from a zpool, including slop space taking up 3.125%. It's by the same guy who made that calculator:

[screenshot]


Even with the right numbers, none of this adds up.

It's worse with my dRAID zpools. I'm losing ~10 TiB for every 100 TiB (after applying the dRAID calculation to my drives' TiB capacity); that's nearly 10%!

Even zpool get size doesn't report the sizes I'd expect:
Code:
# zpool get size
NAME          PROPERTY  VALUE  SOURCE
Bunnies       size      329T   -
TrueNAS-Apps  size      1.36T  -
Wolves        size      511T   -
boot-pool     size      55G    -

All these numbers are off. Wolves is made up of 60 x 9.1 TiB drives (546 TiB), but with its dRAID2:5d:15c:1s config, it should be 364 TiB. I can see losing a few TiB to rounding, but the available space is 332 TiB (a loss of 32 TiB)!
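For reference, the 364 TiB figure is just this back-of-the-envelope dRAID math (it ignores slop, metadata, and per-block allocation rounding):
Code:
# 60 x 9.1 TiB drives laid out as 4 x draid2:5d:15c:1s
def draid_usable_tib(drive_tib, parity, data, children, spares, vdevs):
    # distributed spares come off the top; of the rest, data/(data+parity) is usable
    return vdevs * (children - spares) * drive_tib * data / (data + parity)

print(draid_usable_tib(9.1, parity=2, data=5, children=15, spares=1, vdevs=4))  # ~364 TiB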

[screenshot]


TrueNAS will error at 80% usage, but I've already lost 9% off the top. So that leaves me with only 266 TiB at 80%, where I should've had 291 TiB. That's still a 25 TiB difference, not counting the ~10 TiB of slop.

What's going on here? Where'd all my drive capacity go?
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
I was going to post that it looks good on my system. However, it looks like I also have a ~9% loss, both on RAIDZ1 pools consisting of a single vdev and on pools consisting of single-disk vdevs.

Before all hell breaks loose: the single-disk vdevs are by design, and I understand and accept the risk since it's ephemeral data.
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
I read these two related threads:
  1. https://www.truenas.com/community/threads/zfs-overhead-question.54534/
  2. https://www.truenas.com/community/threads/misaligned-pools-and-lost-space.40288/
Not one person explains this issue. They're talking about how files take up more space, but that has nothing to do with the usable capacity. I'll explain why.

ashift
ashift might affect the total size of a vdev.

I gained 160GiB by changing from 512-byte to 4KiB sectors on each of my 10TB HDDs (from 8.91TiB to 9.1TiB).

If that's true, wouldn't using an ashift of 12 actually get you more capacity?

Or maybe it's that a 1kB file takes up 4kB. In that case, it would only affect file sizes, not usable capacity.
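To illustrate that last point, here's a trivial sketch of sector rounding; it changes how much a small file consumes once written, not the capacity of an empty pool:
Code:
import math

def allocated(file_bytes, ashift):
    # every file occupies whole sectors, so small files round up
    sector = 1 << ashift
    return math.ceil(file_bytes / sector) * sector

print(allocated(1024, 9))   # 1024 bytes with 512-byte sectors (ashift=9)
print(allocated(1024, 12))  # 4096 bytes with 4KiB sectors (ashift=12)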

recordsize
recordsize has nothing to do with total capacity. It's related to how much space each file takes up relative to metadata, but the capacity of your zpool doesn't change whether or not you add metadata vdevs.

Also, each dataset, and even each file, can have different recordsizes. You can even change the recordsize after creating a dataset, and it won't change your usable capacity.

Slop space
As the first poster in those threads found, he's losing 1TiB that's unaccounted for by slop space. I'm losing 20TiB and 30TiB in my pools.

It gets worse as your zpool grows, so something's taking away a percentage that no one seems to know about. It's not slop space.

RAID-Z 2^n+p
Sure, RAID-Z has some sort of 2^n+p equation that might affect the total capacity, but so what? The usable capacity is still missing even with a mirror, so RAID-Z has nothing to do with it.
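For anyone curious where the oft-quoted 3.125% RAID-Z overhead comes from, here's a rough Python paraphrase of the allocation rounding (patterned, from memory, on the asize calculation in OpenZFS's RAID-Z code). Note that it's padding applied to written blocks, which is exactly why it can't explain an empty mirror's missing capacity:
Code:
import math

def raidz_alloc_bytes(psize, ashift, nchildren, nparity):
    sector = 1 << ashift
    data = math.ceil(psize / sector)                           # data sectors
    parity = nparity * math.ceil(data / (nchildren - nparity)) # parity sectors
    total = data + parity
    total = math.ceil(total / (nparity + 1)) * (nparity + 1)   # pad to a (p+1) multiple
    return total * sector

# 4-wide RAIDZ2, ashift=12, 128KiB record:
alloc = raidz_alloc_bytes(128 * 1024, ashift=12, nchildren=4, nparity=2)
ideal = 128 * 1024 * 2          # data plus an equal amount of parity
print(alloc / ideal - 1)        # 0.03125 -> the 3.125% figure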
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
This topic has been discussed quite a bit. I'm fairly certain there was a solid answer but maybe that was just in my mind.
It says because of slop (whatever that is)
Slop is the unused bytes in a block. Example (do not take this completely literally): if you have some data that consumes 515 bytes and data is written in 512-byte blocks, the location where you write the data will hold the first 512 bytes, then you still have to write 3 more bytes to reach 515 bytes total. Hopefully you are following me so far. These two 512-byte blocks now consume 1024 bytes in total, yet the data is only 515 bytes. You cannot use the 509 bytes left over for anything; they become slop. This is just an example to demonstrate what slop is and how it can impact the total capacity. Think about how this can affect 4K blocks; it can add up quickly with Advanced Format drives, hence ashift has a role here.

If you can picture the physical storage as blocks, how the bytes are organized, and how many blocks are needed to store the data, you should be able to visualize what slop is.

Enjoy your conversation about ZFS, Zpool, capacities, and slop.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I'm fairly certain there was a solid answer but maybe that was just in my mind.
Can confirm this, but don't actually recall where.
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
This topic has been discussed quite a bit. I'm fairly certain there was a solid answer but maybe that was just in my mind.

Slop is the unused bytes in a block. Example (do not take this completely literally): if you have some data that consumes 515 bytes and data is written in 512-byte blocks, the location where you write the data will hold the first 512 bytes, then you still have to write 3 more bytes to reach 515 bytes total. Hopefully you are following me so far. These two 512-byte blocks now consume 1024 bytes in total, yet the data is only 515 bytes. You cannot use the 509 bytes left over for anything; they become slop. This is just an example to demonstrate what slop is and how it can impact the total capacity. Think about how this can affect 4K blocks; it can add up quickly with Advanced Format drives, hence ashift has a role here.

If you can picture the physical storage as blocks, how the bytes are organized, and how many blocks are needed to store the data, you should be able to visualize what slop is.

Enjoy your conversation about ZFS, Zpool, capacities, and slop.
If this explanation of slop is correct, I understand how ashift can negatively affect the number of small files you can write, but it shouldn't affect the usable capacity before you have any files written.

And it can't just be RAID-Z or dRAID, because mirrors have the same issue. I'm sure stripes do too, but I haven't tested.

But then why would stripes and mirrors be affected? You haven't written any data to the mirror, so why would the usable capacity be any lower?

Here's a simple mirror example (the calculator says it should be 3.64TiB, the full size of the drive):
[screenshots]


But in actuality, it has less space (96.4%), and then you're only supposed to use up to 80% of that smaller number, which means you can really use only 77.1%:

[screenshot]

The way you described slop, it's not possible for this number to drop before any data is written. With RAID-Z, you could at least calculate it, but mirrors and stripes should be unaffected.

I understand 2GiB is partitioned off, but that number is negligible when we're losing ~23GiB.
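For what it's worth, my own arithmetic with just the 2GiB swap and an assumed 1/32 slop reservation gets close to, but not exactly onto, the 96.4% and 77.1% above:
Code:
GiB, TiB = 2**30, 2**40

raw = 3.64 * TiB
after_swap = raw - 2 * GiB
after_slop = after_swap * (1 - 1 / 32)   # assuming the default 1/32 slop reservation
print(after_slop / raw)                  # ~0.968 vs. the 96.4% reported
print(0.8 * after_slop / raw)            # ~0.775 vs. the 77.1% reported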

I created the same zpool with a single 3.64TiB drive:
[screenshot]

Same usable capacity as the mirror. What's going on here?
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
The new calculator is completely different from the old one, but it includes dRAID values now:

[screenshots]


What am I missing? Where did my 25TiB go?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
What am I missing?
About 25TiB.

Too soon? I was trying to make a funny.

Let us know when you are as bald as I am and have just accepted the values reported. You are taking the values from the calculators too literally. These are estimates. Also, the 80% rule is there to keep the system fast, but once you hit 90%, write behavior changes and it all slows down considerably. It's easy to go from 80% to 90%, so it is nice to have a warning and a 10% buffer to address storage concerns.

Keep searching, hopefully you will find the answer that satisfies your curiosity.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
ZFS keeps a certain amount of space reserved in order to keep the pool from completely stopping even at 100% capacity. There is also space reserved for the ZIL, iirc.
But I would guess it's not as big as 25TiB.


Maybe I found the thread.
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
I don't really care about the lost storage space, tbh. When I started with ZFS I knew it would come at a high cost.

I do care that no one seems to be able to give an actual answer to many questions, that there is no proper/usable documentation (as far as I know), that the GUI doesn't inform the user about where the storage actually goes, and worst of all that you can do things in the GUI that break things, which could easily be avoided.

I personally thought it had to do with the way ZFS works, and that the loss breaks down as follows:
- I remember ZFS needing quite a bit of space for housekeeping: keeping references to the files, managing versions, hashes and the like. Forgive my brief and inaccurate statement here. From what I read, this takes up ~3%.
- A pool on FreeBSD seems to keep ~1.5% reserved, and on Linux ~3%, to make sure you never fill it to 100%. Otherwise, bad times.
- There is apparently a RAIDZ overhead; I found a few mentions of it. Apparently a 4-disk RAIDZ2 with ashift=12 (4K sectors) has a 3.125% overhead, which can be reduced to 0.19% by setting a 1MiB recordsize. I still need to verify this, so take it with a grain of salt. I'm also not sure whether that would be counted against the usable space. SirMaster (https://www.reddit.com/r/zfs/comments/4sqp5i/raidz_pool_has_less_space_than_expected/) posted a spreadsheet detailing ZFS overheads (https://docs.google.com/spreadsheet...J-Dc4ZcwUdt6fkCjpnXxAEFlyA/edit#gid=804965548), and here is another from the same thread, from Matt Ahrens (https://docs.google.com/spreadsheet...jHv6CGVElrPqTA0w_ZY/edit?pli=1#gid=1224630924).

In total, we get to around 9% on a Linux system, if what I wrote is correct. I'm sorry for providing partial information without much proof. I'll see if I can find the time to dig deeper, and I'll follow this thread closely as I am quite curious.
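Adding those rough numbers up (all three values are the unverified estimates above):
Code:
housekeeping = 0.03     # ~3% metadata / housekeeping (unverified, see above)
reservation  = 0.03     # ~3% pool reservation on Linux (unverified, see above)
raidz_pad    = 0.03125  # the 3.125% RAIDZ2 allocation-padding figure (unverified)
print(housekeeping + reservation + raidz_pad)  # ~0.091, i.e. "around 9%"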

Update: Davvo has updated his previous post; please see the thread he linked instead of what I wrote. I left the content for posterity.
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Let us know when you are as bald as I am and have just accepted the values reported. You are taking the values from the calculators too literally. These are estimates. Also, the 80% rule is there to keep the system fast, but once you hit 90%, write behavior changes and it all slows down considerably. It's easy to go from 80% to 90%, so it is nice to have a warning and a 10% buffer to address storage concerns.
I'm sorry to be the bearer of bad news. Not everyone goes bald... I know as a balding middle aged man I feel your pain.

If you want I have a father in law who has a full head of hair. Beating him up might make us feel better :P
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
My father had a full head of hair, grandfather didn't. Mine is virtually all gone as well.
Every engineer goes bald, those who don't aren't true engineers!!1!11
I wish I wasn't an engineer now :grin:
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
ZFS keeps a certain amount of space reserved in order to keep the pool from completely stopping even at 100% capacity. There is also space reserved for the ZIL, iirc.
But I would guess it's not as big as 25TiB.


Maybe I found the thread.
I thought I already responded to those points. It can't be what's in that thread because those things affect filesizes, not capacity.

Post in thread '"Usable capacity" is wrong, and it's not slop' https://www.truenas.com/community/t...-is-wrong-and-its-not-slop.114860/post-797461

I don't really care about the lost storage space, tbh. When I started with ZFS I knew it would come at a high cost.

I do care that no one seems to be able to give an actual answer to many questions, that there is no proper/usable documentation (as far as I know), that the GUI doesn't inform the user about where the storage actually goes, and worst of all that you can do things in the GUI that break things, which could easily be avoided.

I personally thought it had to do with the way ZFS works, and that the loss breaks down as follows:
- I remember ZFS needing quite a bit of space for housekeeping: keeping references to the files, managing versions, hashes and the like. Forgive my brief and inaccurate statement here. From what I read, this takes up ~3%.
- A pool on FreeBSD seems to keep ~1.5% reserved, and on Linux ~3%, to make sure you never fill it to 100%. Otherwise, bad times.
- There is apparently a RAIDZ overhead; I found a few mentions of it. Apparently a 4-disk RAIDZ2 with ashift=12 (4K sectors) has a 3.125% overhead, which can be reduced to 0.19% by setting a 1MiB recordsize. I still need to verify this, so take it with a grain of salt. I'm also not sure whether that would be counted against the usable space. SirMaster (https://www.reddit.com/r/zfs/comments/4sqp5i/raidz_pool_has_less_space_than_expected/) posted a spreadsheet detailing ZFS overheads (https://docs.google.com/spreadsheet...J-Dc4ZcwUdt6fkCjpnXxAEFlyA/edit#gid=804965548), and here is another from the same thread, from Matt Ahrens (https://docs.google.com/spreadsheet...jHv6CGVElrPqTA0w_ZY/edit?pli=1#gid=1224630924).

In total, we get to around 9% on a Linux system, if what I wrote is correct. I'm sorry for providing partial information without much proof. I'll see if I can find the time to dig deeper, and I'll follow this thread closely as I am quite curious.

Update: Davvo has updated his previous post; please see the thread he linked instead of what I wrote. I left the content for posterity.
This is the only explanation that makes more sense, but it's still not enough because, like you said, there's no source.
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
It can't be what's in that thread because those things affect filesizes, not capacity.
I'm not sure about that. I'll see if I have some time to dive in, because I'd like to know as well. No promises though ;)
 