Baffling performance issues with large ZFS pool

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
Just want to start by saying I have searched the forum on performance issues for the past few months, and the threads all come back to failing drives. I have run all the SMART tests etc. and all the drives seem healthy, so I feel like it's time to reach out to see if anyone can help.

The issue at hand... Copying large amounts of files to the system (200GB+), the system will burst extremely fast (900MB/s+) for a few seconds as expected. Then the speeds settle down to about 250MB/s for about a minute. Then the fun begins. The system becomes unresponsive: the drive activity lights stop flashing, SSH disconnects, the WebGUI becomes unresponsive, and jails hang or die. It will stay hung for upwards of 5 minutes, sometimes causing applications to fail along with any file transfers. I have been scratching my head on this, as I cannot figure out what's causing it. Just as a frame of reference, when I first got the box and it had Windows Server 2012 R2 installed, I could sustain 250MB/s to it all day without a single hiccup. Now that I have enabled JBOD and installed FreeNAS, the system is extremely unstable. Looking for help troubleshooting the hang. I have searched the logs and don't see anything obvious, so any help would be appreciated!

System build...

Intel dual SFP+ card configured for LACP to Nexus Core switching

Code:
OS Version:
FreeNAS-11.2-U8

(Build Date: Feb 14, 2020 15:55)

Processor:
Intel(R) Xeon(R) CPU E5607 @ 2.27GHz (8 cores)

Memory:
72 GiB


I have two 500GB HDDs connected to the motherboard SATA ports and ZFS mirrored. Then I have 24 Seagate Constellations configured as follows... The RAIDZ2 layout is required by drive-loss requirements set by others...

Code:
root@freenas[~]# zpool list
NAME           SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Data          43.5T  15.1T  28.4T        -         -    24%    34%  1.48x  ONLINE  /mnt
freenas-boot   464G  2.98G   461G        -         -      -     0%  1.00x  ONLINE  -
root@freenas[~]# zpool status
  pool: Data
 state: ONLINE
  scan: scrub repaired 0 in 1 days 04:56:26 with 0 errors on Mon May  4 04:56:36 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        Data                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/ba5b8d42-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/bb44aabe-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/bc29615c-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/bd218228-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/be12168e-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/bef2d192-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/bff8ace4-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c0e54728-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c1efc57f-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c2e5c00e-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c3d7d3af-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c4c47384-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/c5bb9669-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c6afba1d-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c7a915ef-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c8928c80-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/c994b8c2-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/ca9c8587-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/cba20d8a-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/cc9edb60-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/cd968c08-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/cea7f25b-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/cfa689fd-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0
            gptid/d0a8b450-9abc-11e9-8ddb-001e672bf05c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:34 with 0 errors on Wed May 13 03:45:35 2020
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0

errors: No known data errors



During the hang I cannot check gstat, as the system is completely unresponsive. Again, I'm at a loss on how to troubleshoot this further.

Thanks in advance!
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
Also, just an update: to keep the system stable I have to limit file transfers to ~20MB/s, which is insanely slow... the server has 20Gb/s of fiber to it and I'd like to take advantage of that!
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
By any chance, are you copying using SMB? Do you experience the same hang with another protocol, like NFS or SFTP?
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
It is all protocols. I have tried SFTP, SMB, and NFS; they all act the same. That's why I'm thinking it is an underlying storage-layer issue. The entire system becomes unresponsive, and before the freeze all system resources are barely being taxed.
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147
I'm not sure if enabling deduplication with 72 GB of memory is safe.
Deduplication also affects performance.

Best Regards,
Antonio
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
That's your problem. Dedupe requires ~7x the storage to handle its tables. You're maxing out your RAM trying to dedupe 200GB of files.
 

Alecmascot

Guru
Joined
Mar 18, 2014
Messages
1,177

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912

"This means you need to plan to fit your entire deduplication table in memory to avoid major performance and, potentially, data loss. This generally isn’t a problem when first setting up deduplication, but as the table grow over time, you may unexpectedly find its size exceeds memory. "

"The general rule of thumb here is to have 5 GB of memory for every 1TB of deduplicated data. That said, there may be instances where more is required, but you will need to plan to meet the maximum potential memory requirements to avoid problems down the road."

So that seems brittle.
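And to put numbers on it: with the ~15 TB already allocated in your pool, that guideline works out to roughly 15 × 5 GB ≈ 75 GB of RAM for the dedup table alone, which is already more than the 72 GiB you have installed.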

Which Intel SFP+ card, exactly?
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
What HBA are you using, and how is it configured?
LSI 9650SE SATA-II RAID PCIe (rev 01) in JBOD mode


I'm not sure if enabling deduplication with 72 GB of memory is safe.
And also deduplication affect performance.

Best Regards,
Antonio

Can dedup be disabled on the fly to correct the issue and undedup the data? Or am I stuck with it?

"This means you need to plan to fit your entire deduplication table in memory to avoid major performance and, potentially, data loss. This generally isn’t a problem when first setting up deduplication, but as the table grow over time, you may unexpectedly find its size exceeds memory. "

"The general rule of thumb here is to have 5 GB of memory for every 1TB of deduplicated data. That said, there may be instances where more is required, but you will need to plan to meet the maximum potential memory requirements to avoid problems down the road."

So that seems brittle.

Which Intel SFP+ card, exactly?

I remember reading some original Oracle ZFS docs where that number was much lower, so that's my bad for not reading up on this for FreeBSD/FreeNAS.
Here is the card I am using:
Code:
01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
No, dedupe is defined on pool creation. You'll need to recreate your pools.
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
OK, so I understand I do not have enough RAM to cover the entire volume, but it seems I have plenty of RAM at the moment. Using the FreeBSD and Oracle docs, it looks like I'm only using ~14,805 MB of table space, which fits nicely in 72 GB of RAM. So is dedup really the issue? How can I prove that dedup is actually the issue here?

Number comes from....
Code:
dedup: DDT entries 87708910, size 1099 on disk, 177 in core
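(That is, 87,708,910 entries × 177 bytes per in-core entry ≈ 15.5 GB, or about 14,805 MiB, of DDT held in RAM; on disk it works out to 87,708,910 × 1099 bytes ≈ 96 GB.)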
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
If it's not RAM, it's hardware.

That SATA RAID JBOD: https://www.ixsystems.com/community/resources/whats-all-the-noise-about-hbas-and-why-can't-i-use-a-raid-controller.139/

"FreeBSD may or may not have good support for other HBA's/RAID controllers. "

That Intel 10G SFP+ seems to be an X520, so I'm inclined to think it's okay.

Moving to a supported LSI HBA (a SAS HBA; it will drive your SATA disks just fine) is an easy test; your pools will survive. That's as close to a "must do" as you can come, anyway.

If issues persist and you can rule out RAM, testing via a Chelsio 10G would be the next step, to see whether the X520 is to blame after all.
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147
If you use @Yorick's link, are the numbers the same?
run the ‘zdb -b (pool name)’ command for the desired pool to get an idea of the number of blocks required, then multiply the ‘bp count’ by 320 bytes to get your required memory. If it’s less than 5GB, still use the 5GB per terabyte of storage rule. If it’s higher, go with that number per terabyte.
In your case, 5 GB for each TB is 5 × 15 (without using the zdb command).
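If you want the zdb-based number, on FreeNAS I believe you have to point zdb at the system's pool cache file, since the pool isn't registered in zdb's default location. Something like this, assuming the usual FreeNAS cache file path (adjust if yours differs):
Code:
zdb -U /data/zfs/zpool.cache -b Data

Then multiply the reported 'bp count' by 320 bytes as described above.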
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
Thinking out loud here: Given that there’s lots of room in the pool, could dedupe be turned off after all?
- Turn off dedupe on pool, newly written files won’t use it
- Create new dataset
- rsync -ha data across on SSH session to FreeNAS
- verify it’s all there
- swing shares over
- delete old dataset -> dedupe gone
- repeat performance test

Regardless of result, replace HBA next, or, if moving data can’t be done right away and replacing HBA can, do that first.
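A rough sketch of that sequence from the shell, assuming the pool is named Data and using made-up dataset names (swap in your real dataset and share paths):
Code:
# stop deduplicating new writes; already-written data stays deduped until rewritten
zfs set dedup=off Data

# create a fresh (non-deduped) dataset and copy everything into it
zfs create Data/share_new
rsync -ha --progress /mnt/Data/share_old/ /mnt/Data/share_new/

# verify the copy, repoint the shares at the new dataset, then drop the old one
zfs destroy -r Data/share_old

Then repeat the performance test against the new dataset.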
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
LSI 9650SE SATA-II RAID PCIe (rev 01) in JBOD mode

Take it out, shoot it, burn it, and bury the charred carcass so that it can't pollute your zpool with naughty bits any longer. Replace with a SAS2008-based card at the bare minimum, preferably a SAS2308. ;)

I'm only using 14805MB for DDTs
Code:
dedup: DDT entries 87708910, size 1099 on disk, 177 in core

Problem is that out of your 72GB of RAM, you get (by default) 1/4 of that (18GB) for ZFS metadata and ~15G of that is being used for your DDT. That only leaves 3G for everything else. arc_meta is a soft limit, so it can exceed that, but you'll constantly have pressure back and forth between your primary ARC (data cache) and your metadata ARC as ZFS tries to reduce the arc_meta number. If your DDT or portions of it are on disk, this will result in those brutally slow writes. If this is what you're seeing then I'd consider bumping your arc_meta_limit up with this line in a shell window:

sysctl vfs.zfs.arc_meta_limit=24000000000
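If you want to see where that stands before changing anything, I believe the ARC kstats expose the current value alongside the limit:

sysctl vfs.zfs.arc_meta_limit kstat.zfs.misc.arcstats.arc_meta_used

If arc_meta_used is sitting at or above arc_meta_limit, that's the back-and-forth pressure I'm describing.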

Can I see the output of arc_summary.py for this system?
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
If you use @Yorick's link, are the numbers the same?

In your case, 5 GB for each TB is 5 × 15 (without using the zdb command).

Excuse the possibly stupid question, but... am I missing something lol?
Code:
root@freenas[~]# zpool list 
NAME           SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Data          43.5T  15.1T  28.4T        -         -    24%    34%  1.48x  ONLINE  /mnt
freenas-boot   464G  2.98G   461G        -         -      -     0%  1.00x  ONLINE  -
root@freenas[~]# zdb -b Data
zdb: can't open 'Data': No such file or directory


Thinking out loud here: Given that there’s lots of room in the pool, could dedupe be turned off after all?
- Turn off dedupe on pool, newly written files won’t use it
- Create new dataset
- rsync -ha data across on SSH session to FreeNAS
- verify it’s all there
- swing shares over
- delete old dataset -> dedupe gone
- repeat performance test

Regardless of result, replace HBA next, or, if moving data can’t be done right away and replacing HBA can, do that first.

Great idea, and I'm sure I have enough space to do that, but there really isn't a reason for me to go through it: I have a multitude of real-time backups of the shares and jails, so I can just delete everything and move it back over. Plus, I also get the lockups while reading, so I'm sure the rsync would fail. I may try that at some point this weekend, so I will update this thread as things progress. I do not think the network card is the issue, because I can iperf at close to 8Gb/s for an hour and not run into any I/O issues or the infamous LACP errors that can plague FreeBSD and FreeNAS at times.

I know I may get some eye rolls out of this, but I have had issues with LSI RAID cards in the past. For example, I had a Cisco C240 M3 loaded with 24x1TB drives and an LSI HBA, and actually had better performance out of the RAID 60 than FreeNAS did in an identical ZFS config (tested with both firmwares, etc.). I can't remember the card off the top of my head, but it was one a lot of people recommended on the forum. In the end it was only a temp test server for a month, and my concern wasn't performance at the time. So I agree that it could 100% be the RAID card that is switched to HBA/JBOD mode. But before I jump and buy any hardware, I just wish there was a way to know for sure whether it is the current RAM table usage (according to the Oracle docs it's not, but I understand we are talking about FreeNAS ZFS and not Sun/Oracle ZFS), the RAID controller misbehaving, or another underlying hardware issue. What stinks is that the system is completely unresponsive even on the serial port, so there is no easy way that I know of to see what's happening.
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
Take it out, shoot it, burn it, and bury the charred carcass so that it can't pollute your zpool with naughty bits any longer. Replace with a SAS2008-based card at the bare minimum, preferably a SAS2308. ;)



Problem is that out of your 72GB of RAM, you get (by default) 1/4 of that (18GB) for ZFS metadata and ~15G of that is being used for your DDT. That only leaves 3G for everything else. arc_meta is a soft limit, so it can exceed that, but you'll constantly have pressure back and forth between your primary ARC (data cache) and your metadata ARC as ZFS tries to reduce the arc_meta number. If your DDT or portions of it are on disk, this will result in those brutally slow writes. If this is what you're seeing then I'd consider bumping your arc_meta_limit up with this line in a shell window:

sysctl vfs.zfs.arc_meta_limit=24000000000

Can I see the output of arc_summary.py for this system?

Sure, give me a minute to log in and grab the info.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I know I may get some eye rolls out of this, but I have had issues with LSI RAID cards in the past. For example, I had a Cisco C240 M3 loaded with 24x1TB drives and an LSI HBA, and actually had better performance out of the RAID 60 than FreeNAS did in an identical ZFS config.

Oh, sure. But that c240 wasn't running FreeNAS, was it? The point is that FreeNAS support for those devices ain't great, and ZFS on JBOD ain't great.

For sure, get rid of dedupe first and test.
 

Sanman96

Dabbler
Joined
May 15, 2020
Messages
13
Take it out, shoot it, burn it, and bury the charred carcass so that it can't pollute your zpool with naughty bits any longer. Replace with a SAS2008-based card at the bare minimum, preferably a SAS2308. ;)



Problem is that out of your 72GB of RAM, you get (by default) 1/4 of that (18GB) for ZFS metadata and ~15G of that is being used for your DDT. That only leaves 3G for everything else. arc_meta is a soft limit, so it can exceed that, but you'll constantly have pressure back and forth between your primary ARC (data cache) and your metadata ARC as ZFS tries to reduce the arc_meta number. If your DDT or portions of it are on disk, this will result in those brutally slow writes. If this is what you're seeing then I'd consider bumping your arc_meta_limit up with this line in a shell window:

sysctl vfs.zfs.arc_meta_limit=24000000000

Can I see the output of arc_summary.py for this system?


Ok here is the output.

Code:
root@freenas[~]# arc_summary.py
System Memory:

        0.02%   14.72   MiB Active,     1.87%   1.31    GiB Inact
        94.29%  66.06   GiB Wired,      0.00%   0       Bytes Cache
        3.15%   2.21    GiB Free,       0.67%   482.26  MiB Gap

        Real Installed:                         80.00   GiB
        Real Available:                 89.88%  71.90   GiB
        Real Managed:                   97.45%  70.07   GiB

        Logical Total:                          80.00   GiB
        Logical Used:                   95.60%  76.48   GiB
        Logical Free:                   4.40%   3.52    GiB

Kernel Memory:                                  1.21    GiB
        Data:                           96.57%  1.17    GiB
        Text:                           3.43%   42.54   MiB

Kernel Memory Map:                              70.07   GiB
        Size:                           5.19%   3.64    GiB
        Free:                           94.81%  66.43   GiB
                                                                Page:  1
------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Storage pool Version:                   5000
        Filesystem Version:                     5
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                2.22m
        Mutex Misses:                           951
        Evict Skips:                            951

ARC Size:                               82.25%  56.81   GiB
        Target Size: (Adaptive)         86.84%  59.98   GiB
        Min Size (Hard Limit):          12.50%  8.63    GiB
        Max Size (High Water):          8:1     69.07   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       93.75%  56.23   GiB
        Frequently Used Cache Size:     6.25%   3.75    GiB

ARC Hash Breakdown:
        Elements Max:                           4.42m
        Elements Current:               100.00% 4.42m
        Collisions:                             11.00m
        Chain Max:                              6
        Chains:                                 488.62k
                                                                Page:  2
------------------------------------------------------------------------

ARC Total accesses:                                     739.59m
        Cache Hit Ratio:                98.33%  727.22m
        Cache Miss Ratio:               1.67%   12.37m
        Actual Hit Ratio:               98.13%  725.79m

        Data Demand Efficiency:         95.16%  7.83m
        Data Prefetch Efficiency:       72.06%  1.63m

        CACHE HITS BY CACHE LIST:
          Most Recently Used:           5.48%   39.82m
          Most Frequently Used:         94.33%  685.97m
          Most Recently Used Ghost:     0.32%   2.36m
          Most Frequently Used Ghost:   0.05%   387.45k

        CACHE HITS BY DATA TYPE:
          Demand Data:                  1.02%   7.45m
          Prefetch Data:                0.16%   1.17m
          Demand Metadata:              98.74%  718.05m
          Prefetch Metadata:            0.08%   553.53k

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  3.06%   378.46k
          Prefetch Data:                3.68%   455.48k
          Demand Metadata:              92.56%  11.45m
          Prefetch Metadata:            0.70%   86.00k
                                                                Page:  3
------------------------------------------------------------------------

                                                                Page:  4
------------------------------------------------------------------------

DMU Prefetch Efficiency:                        93.28m
        Hit Ratio:                      4.54%   4.23m
        Miss Ratio:                     95.46%  89.05m

                                                                Page:  5
------------------------------------------------------------------------

                                                                Page:  6
------------------------------------------------------------------------

ZFS Tunable (sysctl):
        kern.maxusers                           4937
        vm.kmem_size                            75234316288
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        1319413950874
        vfs.zfs.vol.immediate_write_sz          32768
        vfs.zfs.vol.unmap_sync_enabled          0
        vfs.zfs.vol.unmap_enabled               1
        vfs.zfs.vol.recursive                   0
        vfs.zfs.vol.mode                        2
        vfs.zfs.sync_pass_rewrite               2
        vfs.zfs.sync_pass_dont_compress         5
        vfs.zfs.sync_pass_deferred_free         2
        vfs.zfs.zio.dva_throttle_enabled        1
        vfs.zfs.zio.exclude_metadata            0
        vfs.zfs.zio.use_uma                     1
        vfs.zfs.zil_slog_bulk                   786432
        vfs.zfs.cache_flush_disable             0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.version.zpl                     5
        vfs.zfs.version.spa                     5000
        vfs.zfs.version.acl                     1
        vfs.zfs.version.ioctl                   7
        vfs.zfs.debug                           0
        vfs.zfs.super_owner                     0
        vfs.zfs.immediate_write_sz              32768
        vfs.zfs.standard_sm_blksz               131072
        vfs.zfs.dtl_sm_blksz                    4096
        vfs.zfs.min_auto_ashift                 12
        vfs.zfs.max_auto_ashift                 13
        vfs.zfs.vdev.queue_depth_pct            1000
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit             32768
        vfs.zfs.vdev.aggregation_limit_non_rotating 131072
        vfs.zfs.vdev.aggregation_limit          1048576
        vfs.zfs.vdev.trim_max_active            64
        vfs.zfs.vdev.trim_min_active            1
        vfs.zfs.vdev.scrub_max_active           2
        vfs.zfs.vdev.scrub_min_active           1
        vfs.zfs.vdev.async_write_max_active     10
        vfs.zfs.vdev.async_write_min_active     1
        vfs.zfs.vdev.async_read_max_active      3
        vfs.zfs.vdev.async_read_min_active      1
        vfs.zfs.vdev.sync_write_max_active      10
        vfs.zfs.vdev.sync_write_min_active      10
        vfs.zfs.vdev.sync_read_max_active       10
        vfs.zfs.vdev.sync_read_min_active       10
        vfs.zfs.vdev.max_active                 1000
        vfs.zfs.vdev.async_write_active_max_dirty_percent 60
        vfs.zfs.vdev.async_write_active_min_dirty_percent 30
        vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
        vfs.zfs.vdev.mirror.non_rotating_inc    0
        vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
        vfs.zfs.vdev.mirror.rotating_seek_inc   5
        vfs.zfs.vdev.mirror.rotating_inc        0
        vfs.zfs.vdev.trim_on_init               1
        vfs.zfs.vdev.bio_delete_disable         0
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.vdev.cache.bshift               16
        vfs.zfs.vdev.cache.size                 0
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.default_ms_shift           29
        vfs.zfs.vdev.min_ms_count               16
        vfs.zfs.vdev.max_ms_count               200
        vfs.zfs.vdev.trim_max_pending           10000
        vfs.zfs.txg.timeout                     5
        vfs.zfs.trim.enabled                    1
        vfs.zfs.trim.max_interval               1
        vfs.zfs.trim.timeout                    30
        vfs.zfs.trim.txg_delay                  32
        vfs.zfs.spa_min_slop                    134217728
        vfs.zfs.spa_slop_shift                  5
        vfs.zfs.spa_asize_inflation             24
        vfs.zfs.deadman_enabled                 1
        vfs.zfs.deadman_checktime_ms            5000
        vfs.zfs.deadman_synctime_ms             1000000
        vfs.zfs.debug_flags                     0
        vfs.zfs.debugflags                      0
        vfs.zfs.recover                         0
        vfs.zfs.spa_load_verify_data            1
        vfs.zfs.spa_load_verify_metadata        1
        vfs.zfs.spa_load_verify_maxinflight     10000
        vfs.zfs.max_missing_tvds_scan           0
        vfs.zfs.max_missing_tvds_cachefile      2
        vfs.zfs.max_missing_tvds                0
        vfs.zfs.spa_load_print_vdev_tree        0
        vfs.zfs.ccw_retry_interval              300
        vfs.zfs.check_hostid                    1
        vfs.zfs.mg_fragmentation_threshold      85
        vfs.zfs.mg_noalloc_threshold            0
        vfs.zfs.condense_pct                    200
        vfs.zfs.metaslab_sm_blksz               4096
        vfs.zfs.metaslab.bias_enabled           1
        vfs.zfs.metaslab.lba_weighting_enabled  1
        vfs.zfs.metaslab.fragmentation_factor_enabled 1
        vfs.zfs.metaslab.preload_enabled        1
        vfs.zfs.metaslab.preload_limit          3
        vfs.zfs.metaslab.unload_delay           8
        vfs.zfs.metaslab.load_pct               50
        vfs.zfs.metaslab.min_alloc_size         33554432
        vfs.zfs.metaslab.df_free_pct            4
        vfs.zfs.metaslab.df_alloc_threshold     131072
        vfs.zfs.metaslab.debug_unload           0
        vfs.zfs.metaslab.debug_load             0
        vfs.zfs.metaslab.fragmentation_threshold 70
        vfs.zfs.metaslab.force_ganging          16777217
        vfs.zfs.free_bpobj_enabled              1
        vfs.zfs.free_max_blocks                 18446744073709551615
        vfs.zfs.zfs_scan_checkpoint_interval    7200
        vfs.zfs.zfs_scan_legacy                 0
        vfs.zfs.no_scrub_prefetch               0
        vfs.zfs.no_scrub_io                     0
        vfs.zfs.resilver_min_time_ms            3000
        vfs.zfs.free_min_time_ms                1000
        vfs.zfs.scan_min_time_ms                1000
        vfs.zfs.scan_idle                       50
        vfs.zfs.scrub_delay                     4
        vfs.zfs.resilver_delay                  2
        vfs.zfs.top_maxinflight                 32
        vfs.zfs.delay_scale                     500000
        vfs.zfs.delay_min_dirty_percent         60
        vfs.zfs.dirty_data_sync                 67108864
        vfs.zfs.dirty_data_max_percent          10
        vfs.zfs.dirty_data_max_max              4294967296
        vfs.zfs.dirty_data_max                  4294967296
        vfs.zfs.max_recordsize                  1048576
        vfs.zfs.default_ibs                     15
        vfs.zfs.default_bs                      9
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.max_idistance            67108864
        vfs.zfs.zfetch.max_distance             8388608
        vfs.zfs.zfetch.min_sec_reap             2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.send_holes_without_birth_time   1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.per_txg_dirty_frees_percent     30
        vfs.zfs.nopwrite_enabled                1
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.dbuf_cache_lowater_pct          10
        vfs.zfs.dbuf_cache_hiwater_pct          10
        vfs.zfs.dbuf_cache_shift                5
        vfs.zfs.dbuf_cache_max_bytes            2317517952
        vfs.zfs.arc_min_prescient_prefetch_ms   6
        vfs.zfs.arc_min_prefetch_ms             1
        vfs.zfs.l2c_only_size                   0
        vfs.zfs.mfu_ghost_data_esize            44765915136
        vfs.zfs.mfu_ghost_metadata_esize        180481536
        vfs.zfs.mfu_ghost_size                  44946396672
        vfs.zfs.mfu_data_esize                  5089809408
        vfs.zfs.mfu_metadata_esize              3698244096
        vfs.zfs.mfu_size                        9171479552
        vfs.zfs.mru_ghost_data_esize            11535908864
        vfs.zfs.mru_ghost_metadata_esize        7032891904
        vfs.zfs.mru_ghost_size                  18568800768
        vfs.zfs.mru_data_esize                  42458968064
        vfs.zfs.mru_metadata_esize              3315952128
        vfs.zfs.mru_size                        50075986944
        vfs.zfs.anon_data_esize                 0
        vfs.zfs.anon_metadata_esize             0
        vfs.zfs.anon_size                       5732352
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms               200
        vfs.zfs.l2arc_feed_secs                 1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost               8388608
        vfs.zfs.l2arc_write_max                 8388608
        vfs.zfs.arc_meta_limit                  18540143616
        vfs.zfs.arc_free_target                 391330
        vfs.zfs.arc_kmem_cache_reap_retry_ms    1000
        vfs.zfs.compressed_arc_enabled          1
        vfs.zfs.arc_grow_retry                  60
        vfs.zfs.arc_shrink_shift                7
        vfs.zfs.arc_average_blocksize           8192
        vfs.zfs.arc_no_grow_shift               5
        vfs.zfs.arc_min                         9270071808
        vfs.zfs.arc_max                         74160574464
        vfs.zfs.abd_chunk_size                  4096
                                                                Page:  7
------------------------------------------------------------------------

root@freenas[~]# 


Your response is exactly what I was looking for as I had no idea that script was there. I haven't had a chance to go over this output since I'm involved in work right now, but if anything sticks out please let me know!


Oh, sure. But that c240 wasn't running FreeNAS, was it? The point is that FreeNAS support for those devices ain't great, and ZFS on JBOD ain't great.

For sure, get rid of dedupe first and test.

Yes, the C240 was running ZFS: 16 cores, a quad-port SFP+ card, and 72GB of RAM. Actually, in that test environment, for s**** and giggles, when I saw the ZFS performance issues I reflashed the HBA back to RAID mode, added the lone virtual disk to ZFS, and saw about a 75MB/s speed increase. So I kept it that way (and yes, I fully understand that breaks every golden rule of FreeNAS), and the pools were stock, no dedup in that case. I've set up probably 40 FreeNAS servers and only ever had two anomalies: the C240 and my current situation. Wish I still had the C240 test bed to test and troubleshoot on, but it was just a temp emergency server.
 