Can ZFS make two files that are the exact same occupy the space of one?

Status
Not open for further replies.

rick6

Dabbler
Joined
Sep 20, 2012
Messages
13
Being a total noob and still in the process of learning, I would like someone to explain something I read about ZFS:

"FreeNAS ZFS

ZFS, for all its sheer awesomeness, isn’t perfect. For starters (and this is a FreeNAS thing), only the more recent versions of ZFS have data deduplication. What’s data deduplication? Well, Windows Home Server has a feature where if you have 2 copies of the exact same file, it’ll only physically store one of them to save space. ZFS goes beyond that: if you have *parts* of a file that are the same, it’ll only physically store one copy of those parts. All automatically in the background (assuming it’s turned on, of course). FreeNAS, from what I gather, will eventually get this, but it could be a while yet, while other NASes like Nexenta and EON already have it."


So basically, if I have two folders with exactly the same content, let's say 30,000 files using 25 GB, ZFS will do the miracle of using only 25 GB of space instead of 50 GB, and keep 30,000 physical files on the drive instead of 60,000?

That sounds too good to be true. Please enlighten me :)
 

tingo

Contributor
Joined
Nov 5, 2011
Messages
137
Wikipedia has an article on data deduplication. Start with that.
FWIW, data deduplication on ZFS (as of now) is not an option that people with normal-sized wallets will want. Why? Because it needs extremely powerful hardware and lots of RAM.
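
To put a rough number on "lots of RAM", here is a back-of-the-envelope sketch. It assumes the commonly quoted rule of thumb of about 320 bytes of in-memory dedup table per unique block and the default 128 KiB record size; both are assumptions for illustration, and `ddt_ram_bytes` is a made-up helper name, not a ZFS tool. Real pools with many small files use smaller blocks, which pushes the figure up considerably.

```python
def ddt_ram_bytes(pool_bytes, block_size=128 * 1024, bytes_per_entry=320):
    """Estimate worst-case dedup table size, assuming every block is unique.

    block_size and bytes_per_entry are rule-of-thumb assumptions,
    not values read from a real pool.
    """
    blocks = -(-pool_bytes // block_size)  # ceiling division
    return blocks * bytes_per_entry


if __name__ == "__main__":
    one_tib = 1024 ** 4
    for bs in (128 * 1024, 8 * 1024):
        gib = ddt_ram_bytes(one_tib, block_size=bs) / 1024 ** 3
        print(f"1 TiB of data at {bs // 1024} KiB blocks: about {gib:.1f} GiB of table")
```

Under these assumptions, 1 TiB of unique data at 128 KiB blocks already needs about 2.5 GiB just for the table, before ZFS has cached anything else.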
 

rick6

Dabbler
Joined
Sep 20, 2012
Messages
13
My hardware is indeed pretty modest: an Atom 2500 at 1.8 GHz and 4 GB of RAM. I mainly use my FreeNAS box to back up several computers (so duplicated data could well end up on the same FreeNAS hard drive), and occasionally I stream 1080p videos. Those are the hardest jobs I put my FreeNAS box through.

If I enabled deduplication on FreeNAS, would I notice a serious degradation in performance?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
You might notice crashing and other problems.

There's no reason that you cannot employ a solution that works within the normal UNIX framework. Look at something like Phil Karn's dupmerge, or just roll your own variant. Phil's put some real work into his latest version of dupmerge though. Of course, you have to be careful about how you use the tools.
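
If you do want to roll your own, a minimal sketch of the idea might look like the following. This is not dupmerge and makes no claim to work the way Phil's tool does; it is just one simple approach, and `find_duplicates` and `merge_duplicates` are made-up names for illustration. It groups files by size, confirms matches with SHA-256, and replaces the extra copies with hard links to the first one. It assumes everything under the root lives on a single filesystem, since hard links cannot cross filesystems.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and content match."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Only consider regular files; skip symlinks.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            groups[(size, file_digest(path))].append(path)
    return [sorted(p) for p in groups.values() if len(p) > 1]


def merge_duplicates(root, dry_run=True):
    """Hard-link duplicates to the first copy; return how many files qualify."""
    linked = 0
    for paths in find_duplicates(root):
        keeper = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already the same inode
            if not dry_run:
                tmp = dup + ".dedup-tmp"
                os.link(keeper, tmp)   # create the new link first
                os.replace(tmp, dup)   # then swap it over the duplicate
            linked += 1
    return linked
```

Run it with `dry_run=True` first to see how many files would be merged. The "be careful" part applies here too: once two names point at the same inode, a program that modifies one in place changes the other as well, and ownership, permissions, and timestamps are shared, so this only suits data you treat as read-only, such as old backups.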
 