ESXi/FreeNAS AIO bottleneck troubleshooting


Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Hi all,

I have an ESXi and FreeNAS all-in-one box on which I seem to be hitting an odd bottleneck on SSDs, and I'm after some help diagnosing it.

My ESXi box runs a FreeNAS VM whose storage lives on a 960 Pro NVMe SSD. I then have a virtual storage network between the FreeNAS VM and ESXi.
The VMXNET3 adapter on FreeNAS is 10 gigabit, and the MTU is set to 9000.
I have a four-disk mirrored set of 2TB Hitachi drives on which an iSCSI zvol is stored and attached to ESXi. I store my VMs here.

Responsiveness with sync disabled on this volume is still poor, so I am moving to SSDs instead.
I have now set up 3x Samsung 860 500GB SSDs in RAIDZ, disabled sync, and created a new iSCSI volume.
I attached this to ESXi and moved a VM to it.

Testing with CrystalDiskMark on a VM on the SSD volume, and on one on the HDD volume, yields the same kind of results for a 1GB file, with the HDD volume actually performing better in some cases:
about 600MB/s read, 500MB/s write sequential
550/300 for 512k
15/12 for 4k
130/100 for 4k QD32

Given that sync is disabled, I would expect better results from the SSD volume, so I'm not sure whether there is an issue here.
Given writes of about 400MB/s for each SSD, I would expect more like 800MB/s writes, if not more.
Whilst responsiveness of the VMs is vastly improved, the sequential write speeds still bother me.

Am I getting expected figures (and if so, why so low), or is there potentially something holding me back?

If further detailed info on exact hardware and configuration settings is required, please let me know what is needed and I will update the post.

Cheers
Eds
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What's the record size on the iSCSI zvol? 64K is a good starting point, but I think it defaults to 8K.
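If you want to check what it's set to now, something like this from the FreeNAS shell should show it (the pool/zvol names here are just placeholders). Bear in mind volblocksize can't be changed on an existing zvol.
Code:
zfs get volblocksize,compression POOLNAME/ZVOLNAME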
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Do you mean the block size?

It defaults to 16k.
Is 64k more appropriate for VM storage?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Do you mean the block size?

It defaults to 16k.
Is 64k more appropriate for VM storage?
It's what Oracle recommends for zvols storing VM OS drives; from there it depends on the application and its IO profile. But again, 64K is a good starting point.

You will need to create a new zvol and move (Storage vMotion or just copy) everything over. You should also look into all of your storage queues, as a full queue anywhere along the path would cause jittery speeds with low average latency and reduced throughput, depending on the IO profile.
https://kb.vmware.com/s/article/1008205
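As a rough sketch of creating the replacement zvol from the shell (pool name, zvol name and size are just examples; the GUI works too):
Code:
zfs create -s -V 400G -o volblocksize=64K POOLNAME/vm64k
Then point a new iSCSI extent at it and Storage vMotion the VMs over.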
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
I've bumped up to 64k, and there is an improvement to about 750MB/s seq read and 450 for writes.
esxtop doesn't show there to be any significant latency.

Should I bump up to 128K, which appears to be the max? Does it also matter that in the iSCSI setup in FreeNAS the block size is advertised as 512?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
I've bumped up to 64k, and there is an improvement to about 750MB/s seq read and 450 for writes.
esxtop doesn't show there to be any significant latency.
I may have linked the wrong page. Look into monitoring IO queues with esxtop.
Should I bump up to 128K, which appears to be the max? Does it also matter that in the iSCSI setup in FreeNAS the block size is advertised as 512?
You could, and you may see an improvement in throughput, but random IO (the kind you see with most workloads outside of streaming) would suffer.
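Roughly what I'd look at in esxtop (the letters are the interactive view keys; the thresholds are just rules of thumb):
Code:
esxtop
# 'd' = disk adapter view, 'u' = disk device view, 'v' = per-VM disk view
# watch DAVG/cmd (device latency), KAVG/cmd (kernel/queueing latency), GAVG/cmd and QUED
# KAVG sitting above ~1-2ms, or QUED above 0, usually means a queue is filling up somewhere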
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
I will have a look into monitoring IO queues as well.

The VMs themselves don't really run any specific workloads, and are more generalised. For example, I have a pfSense VM, DC, RD Gateway and a couple of other general purpose VMs.
Whilst I do have other volumes storing media, which will of course benefit from higher block sizes due to the size of the files, I'm guessing general-purpose VMs probably want to shoot right down the middle?

That being the case, I will tinker with block sizes a little to see what works best.

What I'd like to know is: what would you expect the performance figures to be for 3 SSDs in a RAIDZ1, assuming everything is configured optimally?
Assuming each device has approximately 500MB/s sequential reads and writes?
I'm trying to determine if the volume should be performing better or not.

Is it possibly due to the underlying hardware/6Gb/s link speed of my HBA?
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
I feel like there could be a bus issue somewhere, as I am now trying the three SSDs in a stripe, with sync disabled, and still only manage around 700MB/s reads and 450MB/s writes.

The drives are all connected to a 9207-8i, which should be in a PCIe x8 slot, but the 700 to 750MB/s matches the 6Gb/s line rate of a single SATA link, so I wonder if that's related?

I will also try another NVMe drive I have passed through to FreeNAS to see if that is better. If it is, the issue is not FreeNAS or resource starvation, but more likely the SATA bus/HBA?

EDIT:
I think the issue must be within FreeNAS itself. I am now testing against a Samsung SM953 which is passed through to FreeNAS; its reads should be at least 1GB/s if not much higher, with around 800MB/s writes.

In my testing, I am getting about 600/400 sequential.

Could it be MTU related on the virtual storage network? I have specified MTU of 9000 in FreeNAS, not sure if this could be causing a problem?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Could it be MTU related on the virtual storage network? I have specified MTU of 9000 in FreeNAS, not sure if this could be causing a problem?
Did you also set this MTU on the vswitch, port group, and vmkernel?
From Samsung's white paper on the drive: 128KB sequential read, 1,760MB/s.
Have you tried doing a dd read from within FreeNAS? This would help rule out any VM networking issues.
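A quick way to sanity-check jumbo frames end to end (the IPs are placeholders; 8972 = 9000 minus the 28 bytes of IP/ICMP headers):
Code:
# from the ESXi shell, towards the FreeNAS storage IP:
vmkping -d -s 8972 192.168.50.2
# from the FreeNAS shell, towards the ESXi vmkernel IP:
ping -D -s 8972 192.168.50.1
# and check the configured MTUs on the ESXi side:
esxcli network vswitch standard list
esxcli network ip interface list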
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Did you also set this MTU on the vswitch, port group, and vmkernel?
From Samsung's white paper on the drive: 128KB sequential read, 1,760MB/s.
Have you tried doing a dd read from within FreeNAS? This would help rule out any VM networking issues.

There isn't an option to set this on a port group, but it has been set on the vswitch and vmkernel.

Can you advise the appropriate set of dd switches to test this?

When I've been doing my tests from within the VM, I don't actually see any read activity on the drive within the reporting tab. Could that indicate that the reads are not actually being served from the drive but from elsewhere?

EDIT:
I ask about the command because when I do a dd with if=/dev/zero and bs=2M with a count of 10000, I get 2.5GB/s, which obviously isn't right.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Try this: dd if=/dev/zvol/POOLNAME/ZVOLNAME of=/dev/null bs=128K
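If you don't want it to chew through the whole zvol, cap it with a count, e.g. roughly 10GB:
Code:
dd if=/dev/zvol/POOLNAME/ZVOLNAME of=/dev/null bs=128K count=80000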
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
I let it run for a couple of minutes, and it comes back with about 715MB/s reads.
That's obviously less than half of what Samsung reports the drive to be capable of.

Can I swap if and of to run the same test for writes?

What other things should I monitor for bottlenecks? I assume RAM doesn't come into play here, and that the next most likely thing might be CPU?

EDIT:
I completed more tests as below, and the results don't make sense to me:
Code:
dd if=/dev/zvol/nvme/nvme of=/dev/null bs=128K		900MB/s
dd if=/dev/zvol/satassd/satassd of=/dev/null bs=128K	1.6GB/s
dd if=/dev/zero of=/dev/zvol/nvme/nvme bs=128K		1.5GB/s
dd if=/dev/zero of=/dev/zvol/satassd/satassd bs=128K	1.4GB/s

The first result is read from NVMe, and is way too low.
The second result is read from a SATA SSD and is way too high.
Third result is write to NVMe and is way too high.
Final result is write to SATA SSD and is way too high.

I also see no write activity on those drives under reporting, and read speeds don't match the results of dd. Read speeds only show around 15 to 20MB/s :confused:
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Can I swap if and of to run the same test for writes?
Yeah. For writing to ZFS, turn off compression first: /dev/zero, as you can imagine, compresses extremely well, so the results are highly skewed.
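Something like this, with your own pool/zvol names in place of the placeholders (and assuming the default lz4 so you can put it back afterwards):
Code:
zfs get compression POOLNAME/ZVOLNAME
zfs set compression=off POOLNAME/ZVOLNAME
# ...re-run the dd write test...
zfs set compression=lz4 POOLNAME/ZVOLNAME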
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
The second result is read from a SATA SSD and is way too high.
Perhaps it's reading from the ARC? Monitor the disks with zpool iostat -v to see if it's actually reading from disk. Again, if it's reading compressed zeros, this may skew the result.
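For example, in a second session while the dd is running (one-second intervals; pool name as per your dd paths):
Code:
zpool iostat -v satassd 1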
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Yeah. For writing to ZFS, turn off compression first: /dev/zero, as you can imagine, compresses extremely well, so the results are highly skewed.
OK, disabling compression certainly seems to produce more expected results: about 650-700MB/s writes for the NVMe, and about 350MB/s for the SATA SSD.

As for reads, is there a way to eliminate the ARC from the equation, as there is no L2ARC assigned to these volumes?
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Writes for both drives now look to be ok, and reads are ok for the SATA SSD.

The NVMe SSD, however, seems to be hitting 500MB/s read max, so perhaps that is a bus issue (the drive is on a PCIe carrier card, which might be impacting performance).

I'll now repeat tests with the SATA SSDs in another RAID layout, before moving back to the original issue of performance within VMs.
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
More testing, and a two-SSD stripe is giving me the same read speeds as a single SSD?
The read on each drive drops to 250MB/s. Should I not be expecting close to 1GB/s?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Hmm, I'm not sure. Can you list your full hardware? As for the ARC, a reboot will clear it, but as you read from that disk again it will fill back up. You could also try testing outside of ZFS to see if it's a ZFS quirk or a system quirk.
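For the "outside of ZFS" test, one option is to read one of the SSDs raw, which bypasses ZFS and the ARC entirely (reads only, so it's safe; adjust daX to whatever your SSDs show up as). Alternatively you can keep zvol data out of the ARC for the duration of a test with primarycache:
Code:
# raw sequential read straight off one SSD, roughly 10GB:
dd if=/dev/da1 of=/dev/null bs=1M count=10000
# or keep data blocks out of the ARC while testing, then put it back:
zfs set primarycache=metadata POOLNAME/ZVOLNAME
zfs set primarycache=all POOLNAME/ZVOLNAME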
 

Eds89

Contributor
Joined
Sep 16, 2017
Messages
122
Sure:
Xeon E5-2618L V2 @ 2GHz, 64GB ECC DDR3 RAM
ESXi 6.7 installed on a USB stick, with a FreeNAS VM stored on a 960 Evo NVMe SSD
FreeNAS assigned 24GB RAM and 4vCPUs
LSI 9207-8i passed through to FreeNAS VM, with 24 port Intel SAS expander attached.
SSDs and Hard drives attached to SAS expander
Also, an SM953 NVMe SSD passed through to FreeNAS
Quad port Intel NIC split amongst VM traffic.
Virtual storage network between FreeNAS VM and ESXi for loopback storage hosting other VMs. (VMXNET3 adaptors).
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Thanks for the details. Do you have ANY other VMs running? That's a slow CPU to be giving 4 cores to one VM. The only time I ever built a VM with more was on a host with dual 10-physical-core CPUs at 3GHz+, and that was for a proper database server.
 