How much frag is too much?

Status
Not open for further replies.

Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
Hi,

System info:
Build FreeNAS-9.10.1-U4 (ec9a7d3)
Platform Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Memory 262021MB
14x 4TB NL-SAS w/ 800GB P3700 cache and mirrored 400GB ones for SLOG (overkill, I know)
Configured in a RAID 1+0-like layout

Here's my question: What do I do about fragmentation, and how can I know if it's becoming a problem?

And to further ask, is there a way, while running in production, to test the pool's performance? Unfortunately I don't have great metrics to use from when it was spun up 18 months ago, so it would have to be a general performance overview, not necessarily apples to apples.

This question has come up because of some unexplained "resource unavailable" VM crashes (one or two at a time). This puts the VM in "no boot media available" when it tries to restart, but when we notice it, we start it up with no problem, so clearly whatever caused the issue resolves itself to some degree.

Frag is getting higher and higher. The pool has been in production for about 18 months. It runs a three-node Hyper-V 2012 R2 cluster via iSCSI with about 50 VMs. Mostly low-IO VMs, but a few small Exchange servers and some SQL servers are on there too. Defrag doesn't run automatically on any VMs that I'm aware of, due to the SSD presentation trick the LUN sends from FreeNAS. We only allow use of about half of the pool via three iSCSI LUNs. One probably would have been fine, but three is fine too.

NAME          SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
freenas-boot  29G    698M   28.3G  -         -     2%   1.00x  ONLINE  -
vol0          25.4T  6.91T  18.5T  -         65%   27%  1.00x  ONLINE  /mnt

Lastly, a bonus question: any suggestions on real-time pool redundancy? Thank goodness our SAN is alive and healthy, but I've read a lot about SAN cluster solutions and wanted to ask for any input/direction. Perhaps that would help with this specific frag problem as well.

Thanks in advance.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Here's my question: What do I do about fragmentation, and how can I know if it's becoming a problem?
There's not much you can do. You can replicate fragmented datasets to new datasets, if there's a lot of free space. You can add space and let ZFS handle things as best as it can.

How can you know it's a problem? Do you see a problem? If not, the answer is no.

And to further ask, is there a way, while running in production, to test the pool's performance?
Not really. Your own workload is the best benchmark.

This question has come up because of some unexplained "resource unavailable" VM crashes (one or two at a time). This puts the VM in "no boot media available" when it tries to restart, but when we notice it, we start it up with no problem, so clearly whatever caused the issue resolves itself to some degree.
That sounds like a problem.

Defrag doesn't run automatically on any VMs
Good, because that would only make things worse. And the fragmentation number you see is free space fragmentation, not a more traditional percentage of files fragmented.

You probably need to add more disks to reduce pressure on free space and add IOPS.
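
If you want to keep an eye on those numbers over time, here is a minimal Python sketch that parses `zpool list` output like the text pasted in the first post and flags pools above a threshold. The 50% FRAG and 80% CAP limits are arbitrary assumptions for illustration, not recommendations, and the function names are made up. A flagged pool is not necessarily a pool with a problem; the numbers only tell you what to watch.

```python
# Sketch: parse `zpool list` text and flag pools whose FRAG or CAP
# exceed chosen limits. The limits below are assumptions, not guidance.

SAMPLE = """\
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
freenas-boot 29G 698M 28.3G - - 2% 1.00x ONLINE -
vol0 25.4T 6.91T 18.5T - 65% 27% 1.00x ONLINE /mnt
"""


def parse_zpool_list(text):
    """Turn whitespace-separated `zpool list` output into a list of dicts."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def percent(value):
    """Convert '65%' to 65; return None for '-' (value not reported)."""
    return int(value.rstrip("%")) if value.endswith("%") else None


def flag_pools(pools, frag_limit=50, cap_limit=80):
    """Return (name, frag, cap) for pools above either limit."""
    flagged = []
    for pool in pools:
        frag = percent(pool["FRAG"])
        cap = percent(pool["CAP"])
        over_frag = frag is not None and frag > frag_limit
        over_cap = cap is not None and cap > cap_limit
        if over_frag or over_cap:
            flagged.append((pool["NAME"], frag, cap))
    return flagged


pools = parse_zpool_list(SAMPLE)
print(flag_pools(pools))
```

Run from cron with the live output in place of SAMPLE and you get a crude trend log, which is about the best substitute for the baseline metrics that were never captured.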
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Some of my Linux root pools get up to 55% fragmented. It's partly lots of small files, and half a dozen boot environments.

One thing to understand about ZFS fragmentation is that the number reflects something different than regular file systems. If I understand it correctly, it's more about pieces of free space than fragmented files.
 

Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
So I just found this post http://www.edugeek.net/forums/windo...hyperv-vms-randomly-reboot-2.html#post1673996

The user said that, for him, "The issue turned out to be the HBA Controller not being able to handle to IO traffic which caused the card to reset/resend data."

I'm wondering if the same is happening with us, but I'm not seeing anything in our logs that would point to that. Is there any way to look deeper into whether the HBA is having this issue?
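
For reference, this is roughly how I've been combing the logs so far: a small Python sketch that pulls out lines mentioning timeouts, resets, or retries. The keyword list is only a guess at what a struggling HBA might log; I don't know the exact messages our driver emits, so treat the pattern as an assumption to adjust.

```python
import re

# Assumed keywords; replace with whatever your HBA driver actually logs.
PATTERN = re.compile(r"timeout|reset|retry", re.IGNORECASE)


def suspicious_lines(lines, pattern=PATTERN):
    """Return (line_number, text) for every line matching the pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, 1)
        if pattern.search(line)
    ]


def scan_file(path):
    """Scan a log file on disk, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

Calling `scan_file("/var/log/messages")` on the FreeNAS box returns the matching lines with their line numbers, which can then be compared against the times the VMs dropped.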
 