VMware ESXi, TrueNAS Core + Dedupe

NickF

Guru
Joined
Jun 12, 2014
Messages
760
Hi all,

I've been playing around on my test bench and I'm not really sure what I'm doing wrong.
I am using TrueNAS Core as backend storage for an ESXi host. I installed 2x Samsung 480GB SM953s and set them up as a dedupe VDEV on the pool where the iSCSI target is located. The ZVOL I created for this testing is set to be sparse.
For testing purposes, I set up a VM with thin-provisioned storage and cloned it 3 times. The resulting size reported was ~80GB in VMware and ~69GB in TrueNAS (thanks to a compression ratio of about 1.35x with LZ4).

I deleted the 3 cloned VMs and turned dedupe on for the ZVOL. I then cloned the VM again 3 times. The resulting storage utilization was the same as with dedupe off. I thought maybe I should have deleted the original VM, because dedupe was off when that was created... but if that were the problem, wouldn't I only be seeing 2 guests' worth of storage, rather than all 4?
What am I doing wrong?

[attached screenshots]
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
When you deleted the VMs after the first test, did you see space freed on the pool? I am asking to be sure that VMware really issued UNMAP for that space and that ZFS really freed it. Otherwise the second test would not be valid, since it could have overwritten the same space that was used in the first experiment.

And you are right to ask about the first data copy. Since it was written with dedup disabled, you should have seen only a 2x dedup ratio, not 3x, because the data of the original VM was never added to the dedup table.
 

kspare

Guru
Joined
Feb 19, 2015
Messages
507
Have you tried playing around with the new zfs special dedup vdev?
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
Not with dedup vdevs yet, only with special (all-metadata) ones so far. From the code I have read, it should not affect the dedup operation logic; it just tries to store the dedup tables on the dedicated vdevs, spilling over to the regular ones on overflow.
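For reference, a dedup vdev is added with the same zpool add syntax as a special vdev, just with the dedup keyword. A minimal sketch (pool and device names are placeholders):

Code:
# Mirrored dedup vdev: only dedup tables will be allocated here
zpool add tank dedup mirror da1 da2

# For comparison, a special (all-metadata) vdev uses the same syntax
zpool add tank special mirror da3 da4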
 

kspare

Guru
Joined
Feb 19, 2015
Messages
507
Not with dedup vdevs yet, only with special (all-metadata) ones so far. From the code I have read, it should not affect the dedup operation logic; it just tries to store the dedup tables on the dedicated vdevs, spilling over to the regular ones on overflow.

So the logic stays the same, but it uses the special vdev instead of RAM to store the dedup tables?
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
So the logic stays the same, but it uses the special vdev instead of RAM to store the dedup tables?
Not instead of RAM, but instead of the much slower HDDs of the normal vdevs. The RAM requirements stay the same; only the initial warmup and maybe overflow should be less painful.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
760
When you deleted the VMs after the first test, did you see space freed on the pool? I am asking to be sure that VMware really issued UNMAP for that space and that ZFS really freed it. Otherwise the second test would not be valid, since it could have overwritten the same space that was used in the first experiment.

And you are right to ask about the first data copy. Since it was written with dedup disabled, you should have seen only a 2x dedup ratio, not 3x, because the data of the original VM was never added to the dedup table.
Ahhhhh!
So I will delete the VMs again and wait a while for the cleanup from VMware to actually happen. Thanks!

For anyone who finds this in the future, this is how you force it:
esxcli storage vmfs unmap -l MyDatastore
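
To sanity-check from the TrueNAS side that the space actually comes back, something like this should do it (a sketch; substitute your own pool/zvol names):

Code:
# Space charged to the zvol before/after the unmap
zfs get used,logicalused,referenced MyPool/MyZvol

# Pool-level allocation should drop once ZFS frees the unmapped blocks
zpool list -o name,size,allocated,free MyPool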

EDIT:
I have done the above and attempted to start over, and I am still having the same issue. I don't seem to have any deduplication occurring.
When running the above command after deleting 3 of the 4 VMs, I see the following:
[attached screenshot]



I actually just tried again, a third time, and turned dedupe on for the entire pool instead of just the ZVOL. You will notice no deduplication has occurred:
[attached screenshot]


I can see that the dedup VDEV is definitely doing stuff:
[attached screenshots]


I am cloning in vSphere by doing the following:
[attached screenshot]


Any further guidance would be appreciated!

Testing is being done on:
base-os-12.0-MASTER-202006120424
Supermicro X10SLL-F with an E3-1225 v3 and 16GB of RAM (which I know is NOT enough for production)

Could the fact that I have so little RAM be preventing the system from doing any dedupe? Obviously the dedupe table spills over from RAM to the dedupe VDEV as expected, and I would imagine performance isn't going to be great, both because my ARC is tiny and because it has to wait for the dedupe drive to read back the table. This is just for testing purposes and trying to understand what is going on.
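
For anyone checking the same thing on their own box, I believe the size of the dedup table can be inspected with something like this (treat it as a sketch; the pool name is mine):

Code:
# Dedup table summary and histogram, including entry count and in-core size
zpool status -D "Dimension C-137"

# More detailed DDT statistics straight from zdb
zdb -DD "Dimension C-137"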
 

NickF

Guru
Joined
Jun 12, 2014
Messages
760
Sorry to double post:

Doing some additional digging... IT IS DEDUPING

Code:
Morty# zfs get all Dimension\ C-137
NAME             PROPERTY              VALUE                  SOURCE
Dimension C-137  type                  filesystem             -
Dimension C-137  creation              Sat Jun 13 21:46 2020  -
Dimension C-137  used                  72.0G                  -
Dimension C-137  available             188G                   -
Dimension C-137  referenced            128K                   -
Dimension C-137  compressratio         1.36x                  -
Dimension C-137  mounted               yes                    -
Dimension C-137  quota                 none                   default
Dimension C-137  reservation           none                   default
Dimension C-137  recordsize            128K                   default
Dimension C-137  mountpoint            /mnt/Dimension C-137   default
Dimension C-137  sharenfs              off                    default
Dimension C-137  checksum              on                     default
Dimension C-137  compression           lz4                    local
Dimension C-137  atime                 on                     default
Dimension C-137  devices               on                     default
Dimension C-137  exec                  on                     default
Dimension C-137  setuid                on                     default
Dimension C-137  readonly              off                    default
Dimension C-137  jailed                off                    default
Dimension C-137  snapdir               hidden                 default
Dimension C-137  aclmode               passthrough            local
Dimension C-137  aclinherit            passthrough            local
Dimension C-137  createtxg             1                      -
Dimension C-137  canmount              on                     default
Dimension C-137  xattr                 on                     default
Dimension C-137  copies                1                      local
Dimension C-137  version               5                      -
Dimension C-137  utf8only              off                    -
Dimension C-137  normalization         none                   -
Dimension C-137  casesensitivity       sensitive              -
Dimension C-137  vscan                 off                    default
Dimension C-137  nbmand                off                    default
Dimension C-137  sharesmb              off                    default
Dimension C-137  refquota              none                   default
Dimension C-137  refreservation        none                   default
Dimension C-137  guid                  9344386157178667861    -
Dimension C-137  primarycache          all                    default
Dimension C-137  secondarycache        all                    default
Dimension C-137  usedbysnapshots       0B                     -
Dimension C-137  usedbydataset         128K                   -
Dimension C-137  usedbychildren        72.0G                  -
Dimension C-137  usedbyrefreservation  0B                     -
Dimension C-137  logbias               latency                default
Dimension C-137  objsetid              54                     -
Dimension C-137  dedup                 on                     local
Dimension C-137  mlslabel              none                   default
Dimension C-137  sync                  standard               default
Dimension C-137  dnodesize             legacy                 default
Dimension C-137  refcompressratio      1.00x                  -
Dimension C-137  written               128K                   -
Dimension C-137  logicalused           82.5G                  -
Dimension C-137  logicalreferenced     42.5K                  -
Dimension C-137  volmode               default                default
Dimension C-137  filesystem_limit      none                   default
Dimension C-137  snapshot_limit        none                   default
Dimension C-137  filesystem_count      none                   default
Dimension C-137  snapshot_count        none                   default
Dimension C-137  snapdev               hidden                 default
Dimension C-137  context               none                   default
Dimension C-137  fscontext             none                   default
Dimension C-137  defcontext            none                   default
Dimension C-137  rootcontext           none                   default
Dimension C-137  relatime              off                    default
Dimension C-137  redundant_metadata    all                    default
Dimension C-137  overlay               on                     default
Dimension C-137  encryption            off                    default
Dimension C-137  keylocation           none                   default
Dimension C-137  keyformat             none                   default
Dimension C-137  pbkdf2iters           0                      default
Dimension C-137  special_small_blocks  0                      default


[attached screenshot]


There seems to be something weird going on. If I run the zpool list command, it says I am only allocating 33.4G, with a dedupe ratio of 4.09x, but zfs get all and the GUI are still reporting 72G allocated.
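
For anyone following along, these are the two views side by side from the CLI (a sketch):

Code:
# Pool-level view: physical allocation after dedup
zpool list -o name,size,allocated,free,dedupratio "Dimension C-137"

# Dataset/zvol-level view: space charged before any dedup savings
zfs list -o name,used,logicalused,compressratio -r "Dimension C-137"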

Additionally, it seems to be additively combining the size of the data VDEV and the DEDUPE VDEV. I am not sure if that is expected, but it is confusing if you don't already know.

Also, it doesn't appear to be a memory issue; according to this, the dedupe table is only about 1230B in total.
[attached screenshot]



How is a sysadmin supposed to know the actual utilization of a pool if the GUI doesn't expose the real numbers? Just a little confused: is this expected, or am I having an issue?
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
Dedup works at the pool level (not the dataset level) to reach the maximal deduplication ratio between different datasets. That probably explains why you don't see its effects in the dataset statistics. How would you report deduplicating 3 copies from one dataset and 2 copies from another? And how would you track free space if some of those copies were deleted? So I don't think it is really a bug. The FreeNAS WebUI could probably show pool statistics better, but unfortunately that is not trivial either, since the math there is quite complicated for the average unprepared person. Please create a ticket for it to be verified.

"additively combining the size of the data VDEV and the DEDUPE VDEV" -- this is completely normal. Dedup vdev operates exactly the same as any other vdevs in a pool. It just has special allocation policy, allowing only dedup tables to be stored there. So from the point of space accounting it is no different. If there would be no such vdev, or it is overflown, the dedup tables will be stored on normal vdevs, accounted there.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
760
Thank you for taking the time to reply. I appreciate it.


Dedup works at the pool level (not the dataset level) to reach the maximal deduplication ratio between different datasets. That probably explains why you don't see its effects in the dataset statistics. How would you report deduplicating 3 copies from one dataset and 2 copies from another? And how would you track free space if some of those copies were deleted? So I don't think it is really a bug.
But a ZVOL != a dataset, and the GUI allows you to turn dedupe on and off at the ZVOL level rather than at the entire-pool level if you want. I wouldn't expect it to be able to dedupe data across ZVOLs if only the ZVOLs individually had dedupe turned on, but I would imagine it would do so if you turned it on for the entire pool. Additionally, I'd also imagine that if you built a pool in such a way, where dedupe wasn't enabled globally for the pool, the system would maintain separate tables for each place dedupe is turned on. I don't know if that's how it works or not, just my guess.
The FreeNAS WebUI could probably show pool statistics better, but unfortunately that is not trivial either, since the math there is quite complicated for the average unprepared person. Please create a ticket for it to be verified.
I will definitely create a ticket as you suggest. I think it's quite silly that it isn't reporting actual utilization with dedupe turned on at all. Every other SAN I have used (NetApp, Pure, Hitachi) does this.
"additively combining the size of the data VDEV and the DEDUPE VDEV" -- this is completely normal. Dedup vdev operates exactly the same as any other vdevs in a pool. It just has special allocation policy, allowing only dedup tables to be stored there. So from the point of space accounting it is no different. If there would be no such vdev, or it is overflown, the dedup tables will be stored on normal vdevs, accounted there.

That at least makes sense--thank you for clarifying!


 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
But a ZVOL != a dataset, and the GUI allows you to turn dedupe on and off at the ZVOL level rather than at the entire-pool level if you want. I wouldn't expect it to be able to dedupe data across ZVOLs if only the ZVOLs individually had dedupe turned on, but I would imagine it would do so if you turned it on for the entire pool. Additionally, I'd also imagine that if you built a pool in such a way, where dedupe wasn't enabled globally for the pool, the system would maintain separate tables for each place dedupe is turned on. I don't know if that's how it works or not, just my guess.
For ZFS, a ZVOL is not much different from a dataset with only one file. You can control dedup both per dataset and per zvol, but in both cases it only affects whether ZFS will look up and update the per-pool dedup table for new writes.
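
As a quick illustration (a sketch; the zvol path is just an example):

Code:
# Dedup can be toggled per dataset or per zvol...
zfs set dedup=on "Dimension C-137/esxi-zvol"
zfs get dedup "Dimension C-137/esxi-zvol"

# ...but the table and the resulting ratio are tracked per pool
zpool get dedupratio "Dimension C-137"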
 

RegularJoe

Patron
Joined
Aug 19, 2013
Messages
330
For ZFS, a ZVOL is not much different from a dataset with only one file. You can control dedup both per dataset and per zvol, but in both cases it only affects whether ZFS will look up and update the per-pool dedup table for new writes.

So if we are running NFS, is there a way to see which files are de-duplicated?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
So if we are running NFS, is there a way to see which files are de-duplicated?

Theoretically yes: you could run zdb -ddddd pool/dataset and take the output of the "Indirect blocks" rows, then compare them to zdb -DDDDD pool and see if any of the addresses line up. A script would probably make this easy to do, but it would be time-consuming to run and would scale sharply with the amount of data stored, since you're basically checking your entire deduplication table.
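
A very rough sketch of such a script, assuming the DVAs show up as vdev:offset:asize strings in both outputs (the dataset name is hypothetical, and the grep pattern may need adjusting for your zdb version):

Code:
#!/bin/sh
# Rough sketch: collect the DVAs referenced by a dataset's blocks and
# check which of them also appear in the pool's dedup table.
POOL="Dimension C-137"        # pool name from earlier in this thread
DATASET="$POOL/nfs-share"     # hypothetical NFS dataset

# DVAs (vdev:offset:asize) from the dataset's block dump
zdb -ddddd "$DATASET" | grep -Eo '[0-9]+:[0-9a-f]+:[0-9a-f]+' | sort -u > /tmp/ds_dvas

# DVAs referenced by dedup table entries
zdb -DDDDD "$POOL" | grep -Eo '[0-9]+:[0-9a-f]+:[0-9a-f]+' | sort -u > /tmp/ddt_dvas

# Addresses present in both lists, i.e. blocks of this dataset that
# appear in the dedup table
comm -12 /tmp/ds_dvas /tmp/ddt_dvas

Mapping an address back to a specific file name would still mean tracking which object (and its path) each DVA came from in the -ddddd output, so this only tells you that some of the dataset's blocks line up with DDT entries.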
 