Changing compression on a live dataset

Status
Not open for further replies.

LeftyAce

Cadet
Joined
Nov 17, 2012
Messages
8
Hi All,

I'm trying to determine how much compression to run, i.e. where the balance lies between disk speed, data size, and CPU bottlenecking on my particular hardware. Fortunately I have a ton of data that will eventually migrate to this FreeNAS machine, and I can use it as a representative sample of how compressible things will be.

My question is: if I go into View Volumes => Edit ZFS options, I can change the compression level. When does this take effect? Is it immediate for new data written to the dataset? Does the dataset then end up holding data compressed under different schemes? Or does it re-compress everything that's already there?

A related question: is there a way to see how hard the CPU is working? While rsyncing data over from another machine, if I run top, sshd is using 25% of the CPU (this is a 2x dual-core machine... so is ssh maxing out one core?). Below that is rsync taking about 7%, and nothing significant below that. That is with maximum gzip compression turned on (the data is being funneled through what I think is a gigabit switch, but it could be 100 Mbps).

P.S. I copied a folder from an existing machine. On the original (Linux) machine, du -hs shows 1.4GB. On the FreeNAS box it shows 1.6GB with lzjb compression and 1.5GB with maximum gzip. As far as I know the original filesystem isn't compressed, and presumably even if it were, it wouldn't beat gzip -9, would it? Why is the data getting bigger?
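In case it matters, here's roughly how I'm comparing sizes (tank/test is a placeholder for my dataset; du -A is the FreeBSD flag for apparent size, if I'm reading the man page right):

```shell
# On-disk usage: blocks actually allocated, including ZFS metadata/padding
du -hs /mnt/tank/test

# Apparent size: the byte length of the files themselves (FreeBSD du -A)
du -Ahs /mnt/tank/test

# The overall ratio ZFS says it has achieved for the dataset
zfs get compressratio tank/test
```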

Any thoughts/advice much appreciated!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
OK, I'll attempt to provide what I think are the correct answers. Note that I did a lot of googling to verify my info, but I could still be wrong (except for #1, which I'm sure of):

1. When you change the compression level, it only applies to new writes. Old files (blocks, really) are left in their current compressed (or uncompressed) state.
2. gzip appears to be single-threaded.
3. You should verify that the network switch you are using is actually gigabit.
4. Certain types of compression on certain types of data can make the data bigger. That's why it's important to know what file types you are storing and to do your homework.
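For #1, you can watch it happen from the shell. Something like this should show the current setting and the ratio ZFS has achieved so far (tank/test is a placeholder dataset name; compressratio is cumulative over everything already written, so it only drifts toward the new setting as new data comes in):

```shell
# Show the active compression setting and the achieved ratio so far
zfs get compression,compressratio tank/test

# Change the property; only blocks written from now on use gzip-9,
# existing blocks keep whatever compression they were written with
zfs set compression=gzip-9 tank/test
```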
 

LeftyAce

Cadet
Joined
Nov 17, 2012
Messages
8
Thanks for the reply, noobsauce. I've realized it's probably best to set up multiple datasets to test this, so I can compare and keep track of which folders are compressed at which compression level...

Gzip is indeed single-threaded, but I'd still like a way to see how much CPU the compression is taking. For example, right now I'm copying a whole folder from one dataset to another. The source dataset is gzip -9 compressed, the destination is lzjb. I'm using rsync (so I have progress information), and I see two rsync processes each using between 35 and 50% of my CPU. Yesterday, when I was running rsync over ssh, sshd was taking 25% of the CPU. Is FreeNAS rolling the CPU overhead of the filesystem compression into the amount it lists for the process that is accessing the data? Either there isn't a "filesystem uncompress" process, or it's using negligible CPU time...
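If I understand FreeBSD right, ZFS compression runs in kernel worker threads rather than in the userland process doing the I/O, which would explain why no separate "uncompress" process shows up. Something like this should make it visible (flags per FreeBSD's top, so treat this as a sketch):

```shell
# -S includes system (kernel) processes, -H shows individual threads;
# look for busy kernel/zio threads while a compressed copy is running,
# and watch the "system" CPU percentage in the header climb
top -SH
```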
 