NFS reads 67% slower compared to writes

Status
Not open for further replies.

Eagleman

Dabbler
Joined
Jan 31, 2014
Messages
17
My new pool consists of 6 x 512GB Samsung EVOs with an Intel 900P as SLOG:

Code:
  pool: easy
 state: ONLINE
  scan: resilvered 1.90G in 0 days 00:00:05 with 0 errors on Sat Apr 21 12:45:24 2018
config:

		NAME											STATE	 READ WRITE CKSUM
		easy											ONLINE	   0	 0	 0
		  mirror-0									  ONLINE	   0	 0	 0
			gptid/59862696-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
			gptid/59ca03ea-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
		  mirror-1									  ONLINE	   0	 0	 0
			gptid/5a1633e0-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
			gptid/5a6890fb-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
		  mirror-2									  ONLINE	   0	 0	 0
			gptid/5abb3453-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
			gptid/5b07a7c9-44e2-11e8-bcdd-001b216cc170  ONLINE	   0	 0	 0
		logs
		  nvd0p1										ONLINE	   0	 0	 0

errors: No known data errors


Here is the local performance of the pool with sync=disabled set:

FIO writing (local) with 16 IO depth: 2195MB/s
Code:
root@freenas:/mnt/easy/vmware-nfs/test # fio fio-seq-write.job
file1: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.0
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=1624MiB/s][r=0,w=6496 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=56186: Sun Apr 22 20:01:04 2018
  write: IOPS=8372, BW=2093MiB/s (2195MB/s)(123GiB/60001msec)
	clat (usec): min=23, max=536675, avg=112.91, stdev=1121.18
	 lat (usec): min=24, max=536678, avg=117.48, stdev=1122.20
	clat percentiles (usec):
	 |  1.00th=[   34],  5.00th=[   43], 10.00th=[   44], 20.00th=[   48],
	 | 30.00th=[   52], 40.00th=[   60], 50.00th=[   72], 60.00th=[   91],
	 | 70.00th=[  115], 80.00th=[  129], 90.00th=[  172], 95.00th=[  194],
	 | 99.00th=[  537], 99.50th=[  922], 99.90th=[ 3032], 99.95th=[ 5473],
	 | 99.99th=[20317]
   bw (  MiB/s): min=   12, max= 3290, per=99.42%, avg=2080.85, stdev=431.53, samples=119
   iops		: min=   49, max=13163, avg=8322.92, stdev=1726.09, samples=119
  lat (usec)   : 50=26.61%, 100=37.22%, 250=33.28%, 500=1.78%, 750=0.47%
  lat (usec)   : 1000=0.19%
  lat (msec)   : 2=0.27%, 4=0.10%, 10=0.05%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu		  : usr=4.26%, sys=44.28%, ctx=1285338, majf=0, minf=0
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=0,502330,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2093MiB/s (2195MB/s), 2093MiB/s-2093MiB/s (2195MB/s-2195MB/s), io=123GiB (132GB), run=60001-60001msec


FIO reading (local) with 16 IO depth: 4165MB/s
Code:
root@freenas:/mnt/easy/vmware-nfs/test # fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.0
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=4046MiB/s,w=0KiB/s][r=16.2k,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=57281: Sun Apr 22 20:02:15 2018
   read: IOPS=15.9k, BW=3972MiB/s (4165MB/s)(233GiB/60001msec)
	clat (usec): min=52, max=25326, avg=62.48, stdev=28.12
	 lat (usec): min=52, max=25326, avg=62.53, stdev=28.13
	clat percentiles (usec):
	 |  1.00th=[   60],  5.00th=[   61], 10.00th=[   61], 20.00th=[   61],
	 | 30.00th=[   61], 40.00th=[   62], 50.00th=[   62], 60.00th=[   62],
	 | 70.00th=[   62], 80.00th=[   63], 90.00th=[   64], 95.00th=[   65],
	 | 99.00th=[   88], 99.50th=[  113], 99.90th=[  174], 99.95th=[  215],
	 | 99.99th=[  359]
   bw (  MiB/s): min= 2341, max= 4039, per=98.87%, avg=3926.77, stdev=246.27, samples=119
   iops		: min= 9365, max=16158, avg=15706.62, stdev=985.10, samples=119
  lat (usec)   : 100=99.27%, 250=0.70%, 500=0.03%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 50=0.01%
  cpu		  : usr=1.04%, sys=98.66%, ctx=12352, majf=0, minf=64
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=953221,0,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=3972MiB/s (4165MB/s), 3972MiB/s-3972MiB/s (4165MB/s-4165MB/s), io=233GiB (250GB), run=60001-60001msec


This is about what you would expect from this pool, even though the 10GB file I used is a bit small for performance testing and some of the reads may have come out of the ARC.
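The fio-seq-write.job / fio-seq-read.job files themselves aren't shown in this thread, but from the output above (256KiB blocks, psync engine, iodepth 16, one 10GiB file, 60-second run) they presumably look something like this reconstruction:

```ini
; Reconstructed guess at fio-seq-write.job -- not the actual file from
; this thread. Every parameter is inferred from the fio output above.
[file1]
rw=write          ; rw=read for the companion fio-seq-read.job
bs=256k
ioengine=psync
iodepth=16
size=10240m
time_based
runtime=60
```

One detail worth noting: psync is a synchronous engine, so iodepth=16 has no effect and the job effectively runs at queue depth 1 (the "IO depths : 1=100.0%" line in the output confirms this). At depth 1, every 256KiB request has to complete before the next one is issued, which matters a lot more once each request carries a network round trip.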

Now for the network part:
I mounted the pool over NFS on my 2 ESXi machines. All 3 of my machines are equipped with Intel X520-DA2 cards (10Gbps) and Intel AFBR-703SDZ-IN2 transceivers, connected with LC-to-LC OM3 fibre cables. The cards are directly connected; I am not using a switch.
I do not have any tunables set on FreeNAS, and I increased the number of threads in my NFS config to 32. I did some further testing with more threads on ESXi or FreeNAS, but this didn't seem to have any effect, so I left the FreeNAS NFS config at 32 threads.
The ESXi machine I am using for testing is likewise untouched as far as NFS-specific tunables go.

I made a 50GB VMDK disk on the NFS share and attached it to a CentOS 7 machine. Nothing else is running on the NFS share, and the CentOS VM itself lives on local ESXi storage, so it's just this one disk I am writing to and reading from. The disk is formatted with ext4.

FIO writing (NFS) with 16 IO depth: 1141MB/s
Code:
[root@core benchmark]# fio fio-seq-write.job
file1: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=1082MiB/s][r=0,w=4329 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=3106: Sun Apr 22 19:57:18 2018
  write: IOPS=4353, BW=1088MiB/s (1141MB/s)(63.8GiB/60001msec)
	clat (usec): min=79, max=93065, avg=226.05, stdev=1364.14
	 lat (usec): min=80, max=93068, avg=228.87, stdev=1364.17
	clat percentiles (usec):
	 |  1.00th=[   83],  5.00th=[   85], 10.00th=[   87], 20.00th=[   90],
	 | 30.00th=[   93], 40.00th=[   96], 50.00th=[   99], 60.00th=[  104],
	 | 70.00th=[  110], 80.00th=[  116], 90.00th=[  124], 95.00th=[  135],
	 | 99.00th=[ 1090], 99.50th=[13829], 99.90th=[16450], 99.95th=[16909],
	 | 99.99th=[18220]
   bw (  MiB/s): min=  866, max= 1878, per=99.88%, avg=1087.08, stdev=85.35, samples=120
   iops		: min= 3466, max= 7514, avg=4348.21, stdev=341.40, samples=120
  lat (usec)   : 100=51.57%, 250=47.05%, 500=0.10%, 750=0.23%, 1000=0.06%
  lat (msec)   : 2=0.08%, 4=0.01%, 10=0.22%, 20=0.68%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu		  : usr=1.64%, sys=45.76%, ctx=3246, majf=0, minf=33
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=0,261215,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=1088MiB/s (1141MB/s), 1088MiB/s-1088MiB/s (1141MB/s-1141MB/s), io=63.8GiB (68.5GB), run=60001-60001msec

Disk stats (read/write):
  sdc: ios=0/129339, merge=0/48060, ticks=0/8463868, in_queue=8467668, util=99.80%


So far so good; this is close to the limit of the 10Gbps card (1250MB/s line rate).
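As a sanity check on that limit: 1250MB/s is just the raw line rate (10e9 bits / 8), and the achievable TCP payload rate is a little lower once framing overhead is included. A quick back-of-the-envelope calculation, assuming a standard 1500-byte MTU and no jumbo frames:

```python
# Rough 10GbE goodput estimate. Assumes a standard 1500-byte MTU,
# 20B IP + 20B TCP headers, and 38B of Ethernet overhead per frame
# (14B header + 4B FCS + 8B preamble + 12B inter-frame gap).
LINE_RATE_BPS = 10e9

mtu = 1500
tcp_payload = mtu - 20 - 20          # 1460 bytes of data per packet
wire_bytes = mtu + 38                # 1538 bytes on the wire per frame
efficiency = tcp_payload / wire_bytes

goodput_mb_s = LINE_RATE_BPS / 8 * efficiency / 1e6
print(f"{efficiency:.1%} efficient, ~{goodput_mb_s:.0f} MB/s of TCP payload")
```

So ~1187MB/s is about the best a single 10GbE link can deliver to NFS at a 1500-byte MTU, which makes the 1141MB/s write result look close to wire speed.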

FIO reading (NFS) with 16 IO depth: 376MB/s
Code:
[root@core benchmark]# fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=358MiB/s,w=0KiB/s][r=1433,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=3112: Sun Apr 22 19:58:43 2018
   read: IOPS=1434, BW=359MiB/s (376MB/s)(21.0GiB/60001msec)
	clat (usec): min=403, max=4792, avg=695.46, stdev=129.56
	 lat (usec): min=403, max=4793, avg=695.63, stdev=129.57
	clat percentiles (usec):
	 |  1.00th=[  502],  5.00th=[  529], 10.00th=[  529], 20.00th=[  545],
	 | 30.00th=[  586], 40.00th=[  660], 50.00th=[  709], 60.00th=[  742],
	 | 70.00th=[  775], 80.00th=[  816], 90.00th=[  857], 95.00th=[  889],
	 | 99.00th=[  979], 99.50th=[ 1004], 99.90th=[ 1074], 99.95th=[ 1090],
	 | 99.99th=[ 1631]
   bw (  KiB/s): min=343552, max=446976, per=100.00%, avg=367395.09, stdev=20000.71, samples=120
   iops		: min= 1342, max= 1746, avg=1435.03, stdev=78.18, samples=120
  lat (usec)   : 500=0.98%, 750=61.49%, 1000=36.95%
  lat (msec)   : 2=0.57%, 4=0.01%, 10=0.01%
  cpu		  : usr=0.41%, sys=3.17%, ctx=86093, majf=0, minf=97
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=86091,0,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=359MiB/s (376MB/s), 359MiB/s-359MiB/s (376MB/s-376MB/s), io=21.0GiB (22.6GB), run=60001-60001msec

Disk stats (read/write):
  sdc: ios=86090/15, merge=0/78, ticks=58377/3, in_queue=58324, util=97.07%


Now this is where the problem starts: I am able to reach 4165MB/s on the local benchmark, but only 376MB/s on the NFS benchmark. This doesn't make any sense, because I am able to write to the same NFS datastore at 1141MB/s.

So I did some further digging and ran some iperf benchmarks to and from FreeNAS:

FreeNAS > ESXi (5.33 Gbits/sec)
Code:
--------------
Client connecting to 192.168.22.5, TCP port 5001
TCP window size: 32.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.22.10 port 60859 connected with 192.168.22.5 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  6.20 GBytes  5.33 Gbits/sec



ESXi > FreeNAS (9.38 Gbits/sec)
Code:
--------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.22.10 port 5001 connected with 192.168.22.5 port 47769
[ ID] Interval	   Transfer	 Bandwidth
[  4]  0.0-10.0 sec  10.9 GBytes  9.38 Gbits/sec


Now this is where it gets weird: why am I able to reach 9.38Gbits/sec when connecting from ESXi (client) to FreeNAS (server), but only 5.33Gbits/sec when connecting from FreeNAS (client) to ESXi (server)? I am not sure what causes this; maybe the ESXi environment isn't powerful enough to run an iperf server?
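One way to narrow this down (a suggestion, not something tested in this thread) is to rerun iperf with parallel streams and a larger TCP window; if several streams together reach line rate while a single stream does not, the limit is per-stream (sender CPU or window size) rather than the link itself:

```shell
# Hypothetical follow-up from the FreeNAS side, using the same address
# as the test above.
# -P 4: four parallel streams; -w: larger TCP window; -t 30: longer run
iperf -c 192.168.22.5 -P 4 -t 30
iperf -c 192.168.22.5 -w 256k -t 30
```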

What else have I tried?

I mounted the NFS share directly inside the same VM and got similar performance for both reading and writing to the share.
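For reference, a mount of this kind from inside the VM would look something like the following; the server IP and export path are assumed from elsewhere in the thread:

```shell
# Hypothetical in-guest NFS mount for testing; IP and export path are
# assumptions based on the rest of the thread.
mkdir -p /mnt/nfs-test
mount -t nfs 192.168.22.10:/mnt/easy/vmware-nfs /mnt/nfs-test
```

Testing this way takes the ESXi NFS client and VMDK layer out of the picture, so matching numbers point the finger at the network or server side.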

Changed the following tunables inside ESXi:
NFS.MaxQueueDepth to different values
Increased rx ring buffer to 4096
Net.TcpipHeapMax to 1536
Net.TcpipHeapSize to 32

None of these options actually increased performance.


The only thing I can still do is test performance against a bare-metal CentOS machine; maybe that way I can finally find out whether the bottleneck is in FreeNAS or in ESXi. But it still doesn't make sense that my read speeds are so much lower than my write speeds.

If anyone else has a suggestion on what to check it would be very welcome!
 

zambanini

Patron
Joined
Sep 11, 2013
Messages
479
That is ok. You need to learn about the ARC and the ZFS filesystem.
 

Eagleman

Dabbler
Joined
Jan 31, 2014
Messages
17
That is ok. You need to learn about the ARC and the ZFS filesystem.

So the ARC and the ZFS filesystem should make my performance 800MB/s worse over NFS?

Here is a 100GB test file being read for 5 minutes; 100GB is enough to blow out my ARC and give numbers much closer to what the disks are actually capable of:
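To confirm that 100GB really exceeds the cache, the ARC's current and maximum sizes can be checked on the FreeNAS side (standard FreeBSD ZFS sysctls, values in bytes):

```shell
# Current ARC size and configured maximum, in bytes
sysctl kstat.zfs.misc.arcstats.size
sysctl vfs.zfs.arc_max
```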

FIO reading (local) with 16 IO depth: 2653MB/s
Code:
root@freenas:/mnt/easy/vmware-nfs/test # fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.0
Starting 1 process
file1: Laying out IO file (1 file / 102400MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=2599MiB/s,w=0KiB/s][r=10.4k,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=56872: Sun Apr 22 21:37:56 2018
   read: IOPS=10.1k, BW=2530MiB/s (2653MB/s)(741GiB/300001msec)
	clat (usec): min=42, max=56361, avg=98.07, stdev=148.09
	 lat (usec): min=42, max=56361, avg=98.14, stdev=148.09
	clat percentiles (usec):
	 |  1.00th=[   55],  5.00th=[   59], 10.00th=[   62], 20.00th=[   67],
	 | 30.00th=[   71], 40.00th=[   73], 50.00th=[   76], 60.00th=[   78],
	 | 70.00th=[   82], 80.00th=[   88], 90.00th=[  104], 95.00th=[  145],
	 | 99.00th=[  758], 99.50th=[ 1037], 99.90th=[ 1729], 99.95th=[ 2024],
	 | 99.99th=[ 3687]
   bw (  MiB/s): min=  213, max= 2759, per=99.68%, avg=2521.80, stdev=225.53, samples=599
   iops		: min=  852, max=11039, avg=10086.75, stdev=902.16, samples=599
  lat (usec)   : 50=0.01%, 100=88.59%, 250=8.21%, 500=1.37%, 750=0.79%
  lat (usec)   : 1000=0.48%
  lat (msec)   : 2=0.50%, 4=0.05%, 10=0.01%, 20=0.01%, 100=0.01%
  cpu		  : usr=0.98%, sys=76.28%, ctx=785076, majf=0, minf=64
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=3035890,0,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=2530MiB/s (2653MB/s), 2530MiB/s-2530MiB/s (2653MB/s-2653MB/s), io=741GiB (796GB), run=300001-300001msec
 

Eagleman

Dabbler
Joined
Jan 31, 2014
Messages
17
Here is the benchmark from FreeNAS to real hardware (CentOS installed on an SSD)

FIO writing (NFS) with 16 IO depth: 2034MB/s
Code:
[root@localhost test]# fio fio-seq-write.job
file1: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
fio: native_fallocate call failed: Operation not supported
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=12566: Sun Apr 22 21:57:15 2018
  write: IOPS=7758, BW=1940MiB/s (2034MB/s)(114GiB/60001msec)
	clat (usec): min=59, max=9040.7k, avg=125.93, stdev=16307.89
	 lat (usec): min=61, max=9040.7k, avg=128.38, stdev=16307.89
	clat percentiles (usec):
	 |  1.00th=[   61],  5.00th=[   61], 10.00th=[   62], 20.00th=[   62],
	 | 30.00th=[   62], 40.00th=[   63], 50.00th=[   65], 60.00th=[   68],
	 | 70.00th=[   73], 80.00th=[   89], 90.00th=[   94], 95.00th=[  153],
	 | 99.00th=[  330], 99.50th=[  408], 99.90th=[  594], 99.95th=[  742],
	 | 99.99th=[14877]
   bw (  MiB/s): min=   55, max= 3887, per=100.00%, avg=2692.26, stdev=1168.65, samples=85
   iops		: min=  222, max=15550, avg=10769.01, stdev=4674.61, samples=85
  lat (usec)   : 100=92.13%, 250=5.92%, 500=1.75%, 750=0.15%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 500=0.01%, 2000=0.01%, >=2000=0.01%
  cpu		  : usr=2.28%, sys=57.60%, ctx=46299, majf=0, minf=406
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=0,465508,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=1940MiB/s (2034MB/s), 1940MiB/s-1940MiB/s (2034MB/s-2034MB/s), io=114GiB (122GB), run=60001-60001msec




FIO reading (NFS) with 16 IO depth: 335MB/s
Code:
[root@localhost test]# fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
fio: native_fallocate call failed: Operation not supported
Jobs: 1 (f=1): [R(1)][100.0%][r=389MiB/s,w=0KiB/s][r=1556,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=12536: Sun Apr 22 21:56:00 2018
   read: IOPS=1279, BW=320MiB/s (335MB/s)(18.7GiB/60001msec)
	clat (usec): min=375, max=45016, avg=778.65, stdev=433.71
	 lat (usec): min=376, max=45017, avg=778.97, stdev=433.72
	clat percentiles (usec):
	 |  1.00th=[  433],  5.00th=[  482], 10.00th=[  515], 20.00th=[  603],
	 | 30.00th=[  701], 40.00th=[  750], 50.00th=[  775], 60.00th=[  799],
	 | 70.00th=[  832], 80.00th=[  889], 90.00th=[ 1090], 95.00th=[ 1156],
	 | 99.00th=[ 1221], 99.50th=[ 1254], 99.90th=[ 1549], 99.95th=[ 1827],
	 | 99.99th=[ 5080]
   bw (  KiB/s): min=198144, max=448512, per=99.98%, avg=327469.33, stdev=48157.88, samples=120
   iops		: min=  774, max= 1752, avg=1279.14, stdev=188.11, samples=120
  lat (usec)   : 500=7.69%, 750=33.41%, 1000=46.46%
  lat (msec)   : 2=12.40%, 4=0.02%, 10=0.01%, 50=0.01%
  cpu		  : usr=0.69%, sys=6.90%, ctx=228146, majf=0, minf=318
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=76766,0,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=320MiB/s (335MB/s), 320MiB/s-320MiB/s (335MB/s-335MB/s), io=18.7GiB (20.1GB), run=60001-60001msec






FreeNAS > CentOS (real hardware) (6.97 Gbits/sec)
Code:
--------------
Client connecting to 192.168.20.3, TCP port 5001
TCP window size: 32.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.20.10 port 35307 connected with 192.168.20.3 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  8.12 GBytes  6.97 Gbits/sec




CentOS (real hardware) > FreeNAS (9.41 Gbits/sec)
Code:
--------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.20.10 port 5001 connected with 192.168.20.3 port 44450
[ ID] Interval	   Transfer	 Bandwidth
[  4]  0.0-10.0 sec  11.0 GBytes  9.41 Gbits/sec


A little bit weird that the writes are able to go over the link's line rate (most likely the client's page cache buffering writes); reads however are still around 350MB/s, which seems to be some magical cap.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
A little bit weird that the writes are able to go over the link's line rate (most likely the client's page cache buffering writes); reads however are still around 350MB/s, which seems to be some magical cap.

You said you did no network tuning on the FreeNAS side?
 

Eagleman

Dabbler
Joined
Jan 31, 2014
Messages
17
You said you did no network tuning on the FreeNAS side?

Correct. I tried network tuning before and saw no improvements. This test was done on FreeNAS with no tunables set, against a half-hour-old installation of CentOS.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
Correct. I tried network tuning before and saw no improvements. This test was done on FreeNAS with no tunables set, against a half-hour-old installation of CentOS.

So I have a box in my lab with nearly the same configuration: 11.1-U4, a 7-disk Samsung SSD pool with a 900P as a SLOG.
My pool is RAIDZ2, but that shouldn't matter for the purposes of this test, as either configuration should outperform a 10G link.

I built an Ubuntu 16.04 machine on the ESX host (sorry, I'm not a CentOS guy) and did a couple of tests reading and writing to the VM's disk.

First was a basic DD:
Code:
root@dhcp-179:/var/tmp# dd if=/dev/zero of=file bs=1000k count=8000
8000+0 records in
8000+0 records out
8192000000 bytes (8.2 GB, 7.6 GiB) copied, 7.06037 s, 1.2 GB/s
root@dhcp-179:/var/tmp#

Easy peasy.. zpool iostat -v esx 1 showed the writes on the pool:
Code:
										   capacity	 operations	bandwidth
pool									alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
esx									 1.51T  1.64T	 13  17.1K  1.75M  1.12G
  raidz2								1.51T  1.64T	 13  4.58K  1.75M  30.6M
	gptid/9e26afa4-15fd-11e8-902b-001b216ed0dc	  -	  -	 10  1.39K   284K  11.0M
	gptid/9e7a7583-15fd-11e8-902b-001b216ed0dc	  -	  -	 10  1.40K   280K  11.0M
	gptid/9edbe0d0-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  1.38K   252K  11.1M
	gptid/9f2de5db-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  1.39K   252K  11.0M
	gptid/9f89892a-15fd-11e8-902b-001b216ed0dc	  -	  -	  8  1.36K   232K  10.9M
	gptid/9fdb8c20-15fd-11e8-902b-001b216ed0dc	  -	  -	  8  1.35K   232K  10.8M
	gptid/a02bfdfe-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  1.35K   260K  10.9M
logs										-	  -	  -	  -	  -	  -
  gpt/slog2							  192M  19.7G	  0  12.5K	  0  1.09G
--------------------------------------  -----  -----  -----  -----  -----  -----


Since the dd is only 8GB, the 900P absorbs all of the I/O.

I assume you're using the default example seq write jobs in fio when you're referencing fio fio-seq-write.job.

When I run the write test, systat -ifstat shows network load around 885 MB/s. fio shows about the same and iostat shows similar activity on the pool:
Code:
										   capacity	 operations	bandwidth
pool									alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
esx									 1.52T  1.64T	 12  14.7K  1.42M  1.24G
  raidz2								1.52T  1.64T	 12  10.5K  1.42M   455M
	gptid/9e26afa4-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  5.44K   212K   101M
	gptid/9e7a7583-15fd-11e8-902b-001b216ed0dc	  -	  -	 10  5.44K   232K   101M
	gptid/9edbe0d0-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  5.40K   232K   101M
	gptid/9f2de5db-15fd-11e8-902b-001b216ed0dc	  -	  -	  6  5.34K   176K   101M
	gptid/9f89892a-15fd-11e8-902b-001b216ed0dc	  -	  -	  5  5.40K   132K   101M
	gptid/9fdb8c20-15fd-11e8-902b-001b216ed0dc	  -	  -	  9  5.35K   220K   101M
	gptid/a02bfdfe-15fd-11e8-902b-001b216ed0dc	  -	  -	 10  5.29K   251K   101M
logs										-	  -	  -	  -	  -	  -
  gpt/slog2							 62.5M  19.8G	  0  4.16K	  0   817M
--------------------------------------  -----  -----  -----  -----  -----  -----

Output from the write job:
Code:
root@dhcp-179:/var/tmp# fio fio-seq-write.job
file1: (g=0): rw=write, bs=256K-256K/256K-256K/256K-256K, ioengine=libaio, iodepth=16
fio-2.2.10
Starting 1 process
file1: Laying out IO file(s) (1 file(s) / 10240MB)
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/940.0MB/0KB /s] [0/3760/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=2135: Mon Apr 23 23:44:34 2018
  write: io=733680MB, bw=834761KB/s, iops=3260, runt=900003msec
	slat (usec): min=73, max=430223, avg=297.33, stdev=2740.11
	clat (usec): min=1, max=487401, avg=4484.73, stdev=11596.58
	 lat (usec): min=80, max=487494, avg=4782.31, stdev=11957.43
	clat percentiles (usec):
	 |  1.00th=[ 1224],  5.00th=[ 1256], 10.00th=[ 1304], 20.00th=[ 1368],
	 | 30.00th=[ 1432], 40.00th=[ 1528], 50.00th=[ 1752], 60.00th=[ 5472],
	 | 70.00th=[ 7200], 80.00th=[ 7392], 90.00th=[ 8160], 95.00th=[ 9920],
	 | 99.00th=[11328], 99.50th=[11584], 99.90th=[19584], 99.95th=[374784],
	 | 99.99th=[440320]
	bw (KB  /s): min=121612, max=1829227, per=100.00%, avg=839692.05, stdev=230349.62
	lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 100=0.01%, 250=0.01%
	lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
	lat (msec) : 2=53.27%, 4=5.41%, 10=36.48%, 20=4.72%, 50=0.01%
	lat (msec) : 100=0.01%, 250=0.01%, 500=0.08%
  cpu		  : usr=1.91%, sys=32.67%, ctx=149978, majf=0, minf=13
  IO depths	: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued	: total=r=0/w=2934718/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: io=733680MB, aggrb=834761KB/s, minb=834761KB/s, maxb=834761KB/s, mint=900003msec, maxt=900003msec

Disk stats (read/write):
	dm-0: ios=235/791336, merge=0/0, ticks=18316/137919744, in_queue=137947628, util=99.79%, aggrios=170/746332, aggrmerge=64/63160, aggrticks=13228/125360884, aggrin_queue=125385844, aggrutil=99.79%
  sda: ios=170/746332, merge=64/63160, ticks=13228/125360884, in_queue=125385844, util=99.79%
root@dhcp-179:/var/tmp#


fio fio-seq-read.job looks like this:
Code:
root@dhcp-179:/var/tmp# fio fio-seq-read.job
file1: (g=0): rw=read, bs=256K-256K/256K-256K/256K-256K, ioengine=libaio, iodepth=16
fio-2.2.10
Starting 1 process
file1: Laying out IO file(s) (1 file(s) / 10240MB)
Jobs: 1 (f=1): [R(1)] [100.0% done] [650.8MB/0KB/0KB /s] [2603/0/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=2149: Tue Apr 24 00:00:49 2018
  read : io=584719MB, bw=665275KB/s, iops=2598, runt=900006msec
	slat (usec): min=12, max=863, avg=20.08, stdev= 8.73
	clat (usec): min=527, max=130774, avg=6134.98, stdev=571.15
	 lat (usec): min=552, max=130789, avg=6155.24, stdev=570.34
	clat percentiles (usec):
	 |  1.00th=[ 5600],  5.00th=[ 5728], 10.00th=[ 5728], 20.00th=[ 5728],
	 | 30.00th=[ 5856], 40.00th=[ 6048], 50.00th=[ 6112], 60.00th=[ 6112],
	 | 70.00th=[ 6432], 80.00th=[ 6496], 90.00th=[ 6496], 95.00th=[ 6560],
	 | 99.00th=[ 6752], 99.50th=[ 6880], 99.90th=[ 8768], 99.95th=[10048],
	 | 99.99th=[12736]
	bw (KB  /s): min=432128, max=668160, per=100.00%, avg=665760.97, stdev=10323.93
	lat (usec) : 750=0.01%, 1000=0.01%
	lat (msec) : 2=0.01%, 4=0.02%, 10=99.92%, 20=0.05%, 50=0.01%
	lat (msec) : 250=0.01%
  cpu		  : usr=0.80%, sys=6.46%, ctx=1563541, majf=0, minf=1034
  IO depths	: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued	: total=r=2338874/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
	 latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: io=584719MB, aggrb=665275KB/s, minb=665275KB/s, maxb=665275KB/s, mint=900006msec, maxt=900006msec

Disk stats (read/write):
	dm-0: ios=2338571/94, merge=0/0, ticks=14298068/376, in_queue=14298824, util=100.00%, aggrios=2338874/18, aggrmerge=0/76, aggrticks=14296596/72, aggrin_queue=14296172, aggrutil=100.00%
  sda: ios=2338874/18, merge=0/76, ticks=14296596/72, in_queue=14296172, util=100.00%
root@dhcp-179:/var/tmp#


As far as iperf:
Code:
root@dhcp-179:/var/tmp# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.179 port 5001 connected with 192.168.1.51 port 34230
[ ID] Interval	   Transfer	 Bandwidth
[  4]  0.0-10.0 sec  10.8 GBytes  9.24 Gbits/sec
^Croot@dhcp-179:/var/tmp# iperf -c 192.168.1.51
------------------------------------------------------------
Client connecting to 192.168.1.51, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.179 port 34978 connected with 192.168.1.51 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  10.7 GBytes  9.19 Gbits/sec
root@dhcp-179:/var/tmp#


Off the top of my head, the differences between our two setups are: my network passes through a Dell S4048 switch rather than a direct connection between the ESX hosts and FreeNAS; I use 48 NFS threads; I have atime turned off on the volume (dunno if you do?); and I have some tuning in place for the network cards and TCP stack.
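For anyone wanting to replicate the atime difference mentioned above: it's a per-dataset ZFS property, and turning it off avoids an access-time metadata update on every read. A sketch using the dataset name from earlier in the thread:

```shell
# Disable atime on the NFS dataset (dataset name assumed from the thread)
zfs set atime=off easy/vmware-nfs
zfs get atime easy/vmware-nfs   # verify the setting
```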

The drop on the reads is interesting, as I don't remember seeing it when we tested this drive configuration with earlier versions of 11.1.
 