Has FreeNAS dedupe performance & memory requirement improved?

Chin-Fah HEOH

Dabbler
Joined
Dec 14, 2016
Messages
22
Hello

Two years ago, at the OpenZFS Summit, Matt Ahrens presented this. Video --> https://www.youtube.com/watch?v=PYxFDBgxFS8. He mentioned that his employer, Delphix, had not implemented it but would like interested parties to take up his idea of using logs instead of dedupe hash tables.

I was wondering if any company or person has taken up Matt's idea and fixed the large memory footprint required by ZFS dedupe. A requirement of 5GB of RAM per 1TB of deduped data is not very good. Imagine trying to dedupe 300TB usable: that would mean 1.5TB of RAM!
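As a sanity check on that arithmetic, here is a rough back-of-envelope sketch. The ~320 bytes of core memory per dedup-table entry and the 64KiB average block size are the commonly cited assumptions behind the 5GB-per-1TB rule of thumb, not figures I have measured:

Code:
# Rough DDT RAM estimate; 320 bytes/entry and a 64 KiB average
# block size are rule-of-thumb assumptions, not measured values.
POOL_TIB=300
BLOCKS=$(( POOL_TIB * 1024 * 1024 * 1024 / 64 ))  # pool size in KiB / 64 KiB per block
DDT_BYTES=$(( BLOCKS * 320 ))
echo "$(( DDT_BYTES / 1024 / 1024 / 1024 )) GiB of RAM for the DDT"
# prints: 1500 GiB of RAM for the DDT -- i.e. ~1.5 TiB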

If you have more info, please share.

Thank you
/CF
 

Chin-Fah HEOH

Dabbler
Joined
Dec 14, 2016
Messages
22
I didn't see much about dedupe at the recent DevSummit.

I got a response from Morgan on Facebook about something coming in FreeNAS 12.0. I'm not sure if this is the high-performance dedupe code donated by Panzura, but what I am looking for is OpenZFS dedupe efficiency, where it does not have a high impact on RAM.

Does anyone else have other technical details to share? Thank you
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
ZFSonLinux took the approach of enabling dedicated dedup vdevs in their 0.8; when OpenZFS (eventually) rebases to ZoL, FreeNAS will inherit it. That doesn't negate the benefits of log-style dedup, but it does provide somewhat of a stopgap until it's implemented and soak-tested.
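For the curious, the command-line sketch looks roughly like this (ZoL 0.8 syntax; device names are placeholders, and this isn't something FreeNAS ships today):

Code:
# Create a pool with a dedicated dedup vdev (ZoL 0.8+ allocation classes);
# the dedup table is stored on the mirrored SSDs instead of the data vdevs.
zpool create tank raidz2 da0 da1 da2 da3 da4 da5 \
    dedup mirror nvd0 nvd1

# Or retrofit an existing pool (may need -f if the redundancy level
# differs from the main vdevs):
zpool add tank dedup mirror nvd0 nvd1

# Dedup itself is still enabled per-dataset as usual:
zfs set dedup=on tank/backups

Mirroring that vdev matters, by the way: allocation-class vdevs are pool-critical, so losing the dedup vdev means losing the pool.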

One thing to note about the log-style dedup is that it would still gobble up memory, but it would just have a ceiling. Running out of memory would effectively "pause" deduplication rather than crush performance; obviously a better option from a stability perspective but it runs a bit counter to ZFS always being "honest" so to speak about what it's doing. I'd want it to be very clear that limiting memory available to deduplication could potentially compromise ratios.
 

Chin-Fah HEOH

Dabbler
Joined
Dec 14, 2016
Messages
22
HoneyBadger said:
ZFSonLinux took the approach of enabling dedicated dedup vdevs in their 0.8; when OpenZFS (eventually) rebases to ZoL, FreeNAS will inherit it. That doesn't negate the benefits of log-style dedup, but it does provide somewhat of a stopgap until it's implemented and soak-tested.

One thing to note about the log-style dedup is that it would still gobble up memory, but it would just have a ceiling. Running out of memory would effectively "pause" deduplication rather than crush performance; obviously a better option from a stability perspective but it runs a bit counter to ZFS always being "honest" so to speak about what it's doing. I'd want it to be very clear that limiting memory available to deduplication could potentially compromise ratios.

Definitely. I am not so much concerned about "how" the dedupe will be implemented; I am more focused on RAM efficiency (hash tables or logs, it doesn't really matter). Right now the guideline of 5GB per 1TB of deduped data is truly inefficient and a showstopper if we were to use FreeNAS as a secondary storage target.

I will check out the dedicated dedup vdevs and find out more.

Many thanks.
 