Can I just turn dedup on for an existing FreeNAS 11.1 ZFS dataset / pool?

Status
Not open for further replies.

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
Can I just turn dedup on by enabling it on an existing dataset and pool? Any caveats? Will all existing data get deduplicated, and will this give better results than my existing gzip level 9 compression?

My data (for now) mostly consists of incremental Windows Image Backups from a local Windows server (it stores the backup locally; I back up that folder, incrementally, to my FreeNAS…). I understand I cannot undo this action (short of re-creating the pool and restoring the data). I also understand I risk having to add more RAM, and that I might lose performance (probably will, even!) (https://doc.freenas.org/11/storage.html#deduplication)
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Only new data will be de-duped. This will not save much, if anything, over even basic compression. It's a memory pig and it's slow.
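To put a rough number on the "memory pig" part: a common rule of thumb is roughly 320 bytes of in-core dedup table (DDT) per unique block, so at the default 128K recordsize that works out to a few GiB of RAM per TiB of unique data. A quick back-of-the-envelope sketch (the 1 TiB figure and 320-byte entry size are rule-of-thumb assumptions, not exact):

```shell
# Rough DDT RAM estimate: ~320 bytes per unique record (rule of thumb),
# for 1 TiB of unique data at the default 128K recordsize.
awk 'BEGIN {
  blocks = 2^40 / (128 * 1024)   # 1 TiB / 128K = 8388608 unique blocks
  bytes  = blocks * 320          # estimated in-core DDT size
  printf "~%.1f GiB of RAM for the DDT per TiB of unique data\n", bytes / 2^30
}'
```

And that is on top of the RAM you already want for the ARC, which is why the usual advice is to not even consider dedup on a modest box.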
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
It won't save much? Even if I start over? I can move the data to an empty disk, enable dedup, and start from scratch. But you think that still won't save me a lot of extra room? Appreciated!
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
It won't save much? Even if I start over? I can move the data to an empty disk, enable dedup, and start from scratch. But you think that still won't save me a lot of extra room? Appreciated!
Not worth the hassle. You could set up a separate dataset, enable dedup, and copy some backups over to see for yourself, but I imagine it won't be enough to be worth the drawbacks.
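If you do want to try that experiment, it's only a few commands. A sketch, assuming your pool is called `tank` and your backups live under `/mnt/tank/backups` (both names are examples; adjust to your setup):

```shell
# Create a throwaway dataset with dedup on, copy a backup or two into it,
# then check what dedup and compression actually bought you.
zfs create -o dedup=on -o compression=lz4 tank/dedup-test
cp -a /mnt/tank/backups/some-image /mnt/tank/dedup-test/

zfs get compressratio tank/dedup-test   # per-dataset compression ratio
zpool get dedupratio tank               # pool-wide dedup ratio

# When you are done testing:
zfs destroy tank/dedup-test
```

Note that `dedupratio` is a pool property, so it reflects everything deduped on the pool, not just the test dataset.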
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It won't save much? Even if I start over? I can move the data to an empty disk, enable dedup, and start from scratch. But you think that still won't save me a lot of extra room? Appreciated!
I tried what you are suggesting: move all the data, turn on dedup, move the data back. It was so slow that I stopped the transfer, turned off dedup, and moved on. It might save some space, but it is such a resource hog. The CPU and memory were maxed out and the file transfer was running about a third as fast as normal. If you want to do it, you must have a supremely powerful system with massive amounts of memory.
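For what it's worth, you can get an estimate of the dedup ratio without actually enabling dedup or moving any data: `zdb -S` walks the pool and simulates the dedup table. It's read-only, but it can take a long time and a lot of memory on a big pool. A sketch (pool name `tank` is an example):

```shell
# Simulate dedup on an existing pool. Prints a DDT histogram and an
# estimated "dedup = X.XX" ratio at the end. Read-only, but slow.
zdb -S tank

# On FreeNAS, zdb may need to be pointed at the pool cache file:
zdb -U /data/zfs/zpool.cache -S tank
```

If the simulated ratio comes back near 1.0x, you have your answer without ever having paid the price of turning dedup on.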
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
Hmmm. Bummer! :) Because the Windows image changes slightly every day, the whole image file is backed up daily. And I don't trust Windows Backup at all. There have been quite a few times when I wanted to restore an image and it turned out to be corrupt, despite the backup status being 100% OK. With a daily incremental backup of these files to FreeNAS I can crawl back in time if one day my Windows backup turns out to be corrupted again :)

Maybe I can start using the 10TB disk I have idling around in my FreeNAS to put the backups on... It's a backup of a backup, so despite not being redundant, it's something :) At about 1 TB per image I should be able to fit a lot of backups on it, no? :)
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
If you want to do it, you must have a supremely powerful system with massive amounts of memory.
I think we all know the answer to that :(

;-) Ok, I arrive again at the conclusion that dedup is a no-go. Appreciated! I hope it helps future Googlers.

PS: you would consider 2x Intel(R) Xeon(R) CPU X5650 as slow and old, right? :)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PS: you would consider 2x Intel(R) Xeon(R) CPU X5650 as slow and old, right?
The system that I tried it on was newer. Look at the config in my signature.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
And I don't trust Windows Backup at all. I have had quite a few times that when I wanted to restore in image, it would be corrupt despite backup status being 100% OK.
Yeah, Windows Backup is horrible. You're better off with something free like Veeam Endpoint. You can even set a retention period so you don't endlessly make incrementals and fill your pool.
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
Yeah. Windows backup is horrible. Your better off with something free like Veeam Endpoint. You can even set a retention period so you don't endlessly make incrementals and fill your pool.
I don't think I've come across that one yet - will give it a look! And then find out it's very expensive and not available as warez ;p
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
I don't think I found that one yet - will give it a look! Then see it's very expensive and not available as warez ;p
It's 100% free as in beer. ;)
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Application-level backups are probably not going to dedupe at all. There's no guarantee the application keeps everything block-aligned. Even though it's the same data, every ZFS block (128K by default) is going to be different.

I get good dedupe, but I'm making full vmdk backups from vmware:

Code:
root@nas ~ # zdb -U /data/zfs/zpool.cache -Dv nas-ssd
DDT-sha256-zap-duplicate: 5058848 entries, size 691 on disk, 223 in core
DDT-sha256-zap-unique: 1547768 entries, size 1154 on disk, 372 in core

DDT histogram (aggregated over all DDTs):

bucket			  allocated					   referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
	 1	1.48M	189G	161G	161G	1.48M	189G	161G	161G
	 2	 412K   51.5G   45.8G   45.8G	1013K	127G	113G	113G
	 4	 451K   56.4G   41.8G   41.8G	2.33M	298G	218G	218G
	 8	 492K   61.6G   51.6G   51.6G	5.09M	652G	551G	551G
	16	 498K   62.3G   51.9G   51.9G	10.7M   1.34T   1.10T   1.10T
	32	2.98M	382G	343G	343G	 168M   21.0T   19.0T   19.0T
	64	31.4K   3.92G   2.73G   2.73G	3.36M	430G	306G	306G
   128	1.73K	221M	118M	118M	 323K   40.4G   21.8G   21.8G
   256	  788   98.4M   56.0M   56.0M	 288K   36.0G   20.6G   20.6G
   512	  141   17.6M   10.3M   10.3M	90.6K   11.3G   6.47G   6.47G
	1K	   54   6.75M   2.89M   2.89M	67.7K   8.47G   3.49G   3.49G
	2K	  194   24.2M   15.6M   15.6M	 487K   60.8G   39.1G   39.1G
	4K		1	128K	128K	128K	5.19K	664M	664M	664M
   16K		2	256K	132K	132K	40.8K   5.10G   2.11G   2.11G
 Total	6.30M	806G	698G	698G	 194M   24.2T   21.5T   21.5T

dedup = 31.59, compress = 1.12, copies = 1.00, dedup * compress / copies = 35.50

root@nas ~ # zfs list nas-ssd
NAME	  USED  AVAIL  REFER  MOUNTPOINT
nas-ssd  21.6T   331G	96K  /mnt/nas-ssd


The pool is a non-redundant stripe of two 1 TB SSDs. So I've got almost 22T on 2T of storage with 331G free.

28T if you count 'thin' images:

Code:
root@nas ~ # du -Ash /mnt/nas-ssd/
 28T	/mnt/nas-ssd/


I'm purposely keeping dedup away from my main pool, as I understand the potential downsides of enabling it. It's only backup, so I don't mind the non-redundant nature of the pool. And SSDs are pretty reliable anyway. I log interesting SSD attributes daily to a text file to monitor writes, etc.
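As a sanity check on the zdb output above: the dedup ratio is just referenced DSIZE divided by allocated DSIZE from the histogram totals, i.e. 21.5T over 698G, which lands right around the reported 31.59 up to rounding:

```shell
# Recompute the dedup ratio from the DDT histogram "Total" row:
# referenced DSIZE = 21.5 TiB, allocated DSIZE = 698 GiB.
awk 'BEGIN {
  referenced = 21.5 * 1024   # 21.5 TiB expressed in GiB
  allocated  = 698           # GiB actually stored on disk
  printf "dedup ratio ~= %.1f\n", referenced / allocated
}'
```

The small gap versus the printed 31.59 is just the rounding in the 21.5T and 698G figures.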
 