No BLOCKLEVEL deduplication ?!

Status
Not open for further replies.

Sergej31231

Dabbler
Joined
Apr 26, 2017
Messages
14
The way it's supposed to work is that individual blocks have checksums. Each block with the same checksum will only be stored on disk once.

Block sizes in ZFS default to 128KB, I think, and can be set per dataset (up to 1MB if the pool has the large_blocks feature enabled).

Thus duplicating a file shouldn't use any extra space. Also, truncating part of the end of one of those files shouldn't use more than 1 extra block of space, and modifying one block in one of those files, again, shouldn't result in more than 1 extra block of used space.

Of course, used space will go up, but available space will not go down (by much).

Other than that, I have no experience with dedup, and can't tell you how good or bad the dedup ratio numbers are.

And then those blocks get compressed, leading to variable block sizes.
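
To make the block/checksum idea above concrete, here is a rough Python sketch. It is only a toy model, not how ZFS is implemented: it assumes a fixed 128 KiB block size, uses SHA-256 as the checksum, and ignores compression. The names block_checksums and unique_blocks are made up for this illustration.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # assumption: 128 KiB blocks, no compression


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def unique_blocks(*files: bytes) -> int:
    """Count the distinct block checksums across all given files."""
    seen = set()
    for data in files:
        seen.update(block_checksums(data))
    return len(seen)


# Ten full blocks of random data, so every block is distinct.
file1 = os.urandom(10 * BLOCK_SIZE)
file2 = file1 + b"abcd"   # exact copy plus four bytes appended at the end
file3 = b"abcd" + file1   # the same four bytes inserted at the front

print(unique_blocks(file1))          # blocks needed for file1 alone
print(unique_blocks(file1, file1))   # a plain duplicate adds nothing
print(unique_blocks(file1, file2))   # appending adds one block
print(unique_blocks(file1, file3))   # prepending shifts every block boundary
```

In this model the appended copy costs one extra block, while the prepended copy shares nothing with the original because every block boundary moves by four bytes. So fixed-size block dedup works well for copies and appends, and poorly when data is inserted near the start of a file.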


OK, I tried an experiment as follows:

file1 = 17.136MB
upload_2017-5-2_8-5-19.png

Content:
upload_2017-5-2_8-5-57.png


file2 = exact copy of file1 + "abcd" characters
upload_2017-5-2_8-6-46.png


Now you seem to be right :)
 

Sergej31231

Dabbler
Joined
Apr 26, 2017
Messages
14
Yes, you seem to be absolutely right! Thank you very much. Now I believe that this technology is working.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
There are several scenarios in which deduplication can take place, I think.
One is live (inline) deduplication: the data must be buffered before being written to disk, and once deduplication has processed the blocks, it increments the reference count of the object being deduplicated.
The second is post-process deduplication, which runs at a later stage, after the data has been written.
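
As a rough illustration of the live (inline) case described above, here is a small Python sketch of a block store that keeps a reference count per checksum. It is a toy model under my own assumptions, not ZFS code: DedupStore, write and free are hypothetical names, and SHA-256 stands in for the block checksum.

```python
import hashlib


class DedupStore:
    """Toy inline-deduplicating block store (illustrative only, not ZFS code)."""

    def __init__(self):
        self.blocks = {}    # checksum -> block data, stored once
        self.refcount = {}  # checksum -> number of references to that block

    def write(self, block: bytes) -> str:
        """Store a block, or only bump the reference count if it already exists."""
        key = hashlib.sha256(block).hexdigest()
        if key in self.blocks:
            self.refcount[key] += 1
        else:
            self.blocks[key] = block
            self.refcount[key] = 1
        return key

    def free(self, key: str) -> None:
        """Drop one reference and release the block when nothing points to it."""
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            del self.refcount[key]
            del self.blocks[key]


store = DedupStore()
a = store.write(b"x" * 4096)
b = store.write(b"x" * 4096)   # duplicate: no new block is stored
c = store.write(b"y" * 4096)

print(len(store.blocks))       # number of blocks actually stored
print(store.refcount[a])       # references to the duplicated block
```

The lookup happens on every write, before the block is stored, which is why inline dedup needs the checksum table to be quickly accessible. A post-process design would instead write everything first and merge duplicates in a later pass.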
 

Sergej31231

Dabbler
Joined
Apr 26, 2017
Messages
14
But there is another problem.
I have a big 16-core Xeon machine running FreeNAS. It is installed directly on the hardware.

When I try to copy a few big files (~1-2GB each) to the Samba share, the connection breaks. I tried disabling and enabling dedup, and that does not seem to be the problem.
Is there a bug in FreeNAS or in the Samba version? Especially when I start copying concurrently from different machines to the FreeNAS share, the connection goes up and down, then finally breaks.

Machine load is ~ 0%
RAM load is low
Tried with dedup on and off (neither helped me find the problem)
 

Sergej31231

Dabbler
Joined
Apr 26, 2017
Messages
14
Thank you for your reply

Yes, I know. Is it possible to switch ZFS to post-process dedup?
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I don't know.
You will have to search online.
You could search the OpenZFS, Solaris, or Oracle documentation.
Maybe there are parameters for zfs and zpool commands to handle the process.
 