I re-ran the test on TrueNAS Core and the results are dramatically better:
Code:
fio --name=random-write --eta-newline=5s --filename=temp.file --ioengine=posixaio --rw=write --numjob=1 --size=2g --io_size=10g --iodepth=32 --runtime=60 --direct=1 --fsync=10000
Code:
random-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=32
fio-3.27
Starting 1 process
random-write: Laying out IO file (1 file / 2048MiB)
Jobs: 1 (f=1): [W(1)][19.4%][w=182MiB/s][w=46.7k IOPS][eta 00m:29s]
Jobs: 1 (f=1): [W(1)][25.0%][w=172MiB/s][w=43.9k IOPS][eta 00m:36s]
Jobs: 1 (f=1): [W(1)][32.7%][w=161MiB/s][w=41.2k IOPS][eta 00m:35s]
Jobs: 1 (f=1): [W(1)][42.6%][w=127MiB/s][w=32.6k IOPS][eta 00m:31s]
Jobs: 1 (f=1): [W(1)][51.8%][w=124MiB/s][w=31.7k IOPS][eta 00m:27s]
Jobs: 1 (f=1): [W(1)][60.7%][w=177MiB/s][w=45.3k IOPS][eta 00m:22s]
Jobs: 1 (f=1): [W(1)][70.2%][w=178MiB/s][w=45.6k IOPS][eta 00m:17s]
Jobs: 1 (f=1): [W(1)][79.3%][w=150MiB/s][w=38.5k IOPS][eta 00m:12s]
Jobs: 1 (f=1): [W(1)][88.1%][w=185MiB/s][w=47.3k IOPS][eta 00m:07s]
Jobs: 1 (f=1): [W(1)][98.3%][w=177MiB/s][w=45.2k IOPS][eta 00m:01s]
Jobs: 1 (f=1): [W(1)][100.0%][w=174MiB/s][w=44.6k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=1696: Fri Mar 18 20:33:27 2022
write: IOPS=44.7k, BW=175MiB/s (183MB/s)(10.0GiB/58680msec); 0 zone resets
slat (nsec): min=1248, max=48665k, avg=6192.59, stdev=43464.73
clat (usec): min=22, max=450050, avg=669.19, stdev=2116.12
lat (usec): min=30, max=450067, avg=675.38, stdev=2116.74
clat percentiles (usec):
| 1.00th=[ 72], 5.00th=[ 109], 10.00th=[ 147], 20.00th=[ 289],
| 30.00th=[ 537], 40.00th=[ 660], 50.00th=[ 709], 60.00th=[ 742],
| 70.00th=[ 775], 80.00th=[ 816], 90.00th=[ 881], 95.00th=[ 988],
| 99.00th=[ 1401], 99.50th=[ 1663], 99.90th=[10814], 99.95th=[32637],
| 99.99th=[83362]
bw ( KiB/s): min=44776, max=477011, per=100.00%, avg=178846.33, stdev=57449.87, samples=116
iops : min=11194, max=119252, avg=44711.30, stdev=14362.40, samples=116
lat (usec) : 50=0.14%, 100=3.70%, 250=13.83%, 500=10.66%, 750=33.91%
lat (usec) : 1000=33.09%
lat (msec) : 2=4.38%, 4=0.13%, 10=0.06%, 20=0.03%, 50=0.04%
lat (msec) : 100=0.02%, 250=0.01%, 500=0.01%
fsync/fdatasync/sync_file_range:
sync (usec): min=38, max=6428, avg=506.85, stdev=513.63
sync percentiles (usec):
| 1.00th=[ 55], 5.00th=[ 72], 10.00th=[ 89], 20.00th=[ 145],
| 30.00th=[ 251], 40.00th=[ 343], 50.00th=[ 441], 60.00th=[ 553],
| 70.00th=[ 652], 80.00th=[ 750], 90.00th=[ 857], 95.00th=[ 971],
| 99.00th=[ 2737], 99.50th=[ 4621], 99.90th=[ 6128], 99.95th=[ 6325],
| 99.99th=[ 6456]
cpu : usr=21.87%, sys=51.07%, ctx=733769, majf=0, minf=1
IO depths : 1=0.1%, 2=0.5%, 4=1.8%, 8=4.0%, 16=66.3%, 32=27.6%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=95.7%, 8=2.9%, 16=1.0%, 32=0.5%, 64=0.0%, >=64=0.0%
issued rwts: total=0,2621440,0,8581 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=175MiB/s (183MB/s), 175MiB/s-175MiB/s (183MB/s-183MB/s), io=10.0GiB (10.7GB), run=58680-58680msec
Re-running using the posixaio library on scale:
Code:
random-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=32
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][11.7%][w=13.5MiB/s][w=3456 IOPS][eta 00m:53s]
Jobs: 1 (f=1): [W(1)][21.7%][w=29.9MiB/s][w=7655 IOPS][eta 00m:47s]
Jobs: 1 (f=1): [W(1)][31.7%][w=17.9MiB/s][w=4576 IOPS][eta 00m:41s]
Jobs: 1 (f=1): [W(1)][41.7%][w=11.5MiB/s][w=2946 IOPS][eta 00m:35s]
Jobs: 1 (f=1): [W(1)][51.7%][w=17.1MiB/s][w=4384 IOPS][eta 00m:29s]
Jobs: 1 (f=1): [W(1)][61.7%][w=22.6MiB/s][w=5792 IOPS][eta 00m:23s]
Jobs: 1 (f=1): [W(1)][71.7%][w=15.4MiB/s][w=3939 IOPS][eta 00m:17s]
Jobs: 1 (f=1): [W(1)][81.7%][w=29.9MiB/s][w=7660 IOPS][eta 00m:11s]
Jobs: 1 (f=1): [W(1)][91.7%][w=11.5MiB/s][w=2944 IOPS][eta 00m:05s]
Jobs: 1 (f=1): [W(1)][100.0%][w=14.4MiB/s][w=3680 IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=9875: Fri Mar 18 20:43:18 2022
write: IOPS=7222, BW=28.2MiB/s (29.6MB/s)(1693MiB/60004msec); 0 zone resets
slat (nsec): min=122, max=219996, avg=376.58, stdev=1030.53
clat (usec): min=9, max=168436, avg=4412.21, stdev=9600.21
lat (usec): min=10, max=168436, avg=4412.59, stdev=9600.25
clat percentiles (usec):
| 1.00th=[ 562], 5.00th=[ 594], 10.00th=[ 619], 20.00th=[ 644],
| 30.00th=[ 676], 40.00th=[ 717], 50.00th=[ 758], 60.00th=[ 816],
| 70.00th=[ 914], 80.00th=[ 3490], 90.00th=[ 13566], 95.00th=[ 24773],
| 99.00th=[ 44827], 99.50th=[ 57410], 99.90th=[ 83362], 99.95th=[ 99091],
| 99.99th=[120062]
bw ( KiB/s): min= 3840, max=145152, per=100.00%, avg=28992.54, stdev=29636.75, samples=119
iops : min= 960, max=36288, avg=7248.13, stdev=7409.19, samples=119
lat (usec) : 10=0.01%, 20=0.05%, 50=0.04%, 100=0.03%, 250=0.01%
lat (usec) : 500=0.01%, 750=47.55%, 1000=25.85%
lat (msec) : 2=3.33%, 4=3.61%, 10=4.71%, 20=8.14%, 50=5.91%
lat (msec) : 100=0.72%, 250=0.04%
fsync/fdatasync/sync_file_range:
sync (usec): min=8, max=35020, avg=967.96, stdev=4139.02
sync percentiles (usec):
| 1.00th=[ 12], 5.00th=[ 18], 10.00th=[ 21], 20.00th=[ 26],
| 30.00th=[ 31], 40.00th=[ 38], 50.00th=[ 48], 60.00th=[ 61],
| 70.00th=[ 76], 80.00th=[ 578], 90.00th=[ 717], 95.00th=[ 3032],
| 99.00th=[15795], 99.50th=[34866], 99.90th=[34866], 99.95th=[34866],
| 99.99th=[34866]
cpu : usr=1.25%, sys=0.21%, ctx=22896, majf=6, minf=48
IO depths : 1=0.1%, 2=1.0%, 4=7.0%, 8=23.8%, 16=57.6%, 32=11.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=96.9%, 8=0.2%, 16=0.1%, 32=2.8%, 64=0.0%, >=64=0.0%
issued rwts: total=0,433377,0,1666 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=28.2MiB/s (29.6MB/s), 28.2MiB/s-28.2MiB/s (29.6MB/s-29.6MB/s), io=1693MiB (1775MB), run=60004-60004msec
EDIT: Switching the SATA controller from "Legacy" in the BIOS (so that I could boot from the optical SATA port) to AHCI, and installing Scale on USB thumb drives, performance saw an increase:
Code:
random-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=32
fio-3.25
Starting 1 process
random-write: Laying out IO file (1 file / 2048MiB)
Jobs: 1 (f=1): [W(1)][17.9%][w=151MiB/s][w=38.7k IOPS][eta 00m:32s]
Jobs: 1 (f=1): [W(1)][25.5%][w=143MiB/s][w=36.7k IOPS][eta 00m:38s]
Jobs: 1 (f=1): [W(1)][32.8%][w=72.4MiB/s][w=18.5k IOPS][eta 00m:39s]
Jobs: 1 (f=1): [W(1)][41.7%][w=134MiB/s][w=34.4k IOPS][eta 00m:35s]
Jobs: 1 (f=1): [W(1)][51.7%][w=148MiB/s][w=37.8k IOPS][eta 00m:29s]
Jobs: 1 (f=1): [W(1)][61.7%][w=143MiB/s][w=36.7k IOPS][eta 00m:23s]
Jobs: 1 (f=1): [W(1)][71.7%][w=149MiB/s][w=38.1k IOPS][eta 00m:17s]
Jobs: 1 (f=1): [W(1)][81.7%][w=132MiB/s][w=33.8k IOPS][eta 00m:11s]
Jobs: 1 (f=1): [W(1)][91.7%][w=120MiB/s][w=30.7k IOPS][eta 00m:05s]
Jobs: 1 (f=1): [W(1)][100.0%][w=150MiB/s][w=38.4k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=9716: Fri Mar 18 23:19:16 2022
write: IOPS=37.7k, BW=147MiB/s (155MB/s)(8846MiB/60001msec); 0 zone resets
slat (nsec): min=123, max=245296, avg=434.23, stdev=925.35
clat (usec): min=10, max=14211, avg=838.19, stdev=673.83
lat (usec): min=10, max=14211, avg=838.62, stdev=673.85
clat percentiles (usec):
| 1.00th=[ 155], 5.00th=[ 180], 10.00th=[ 202], 20.00th=[ 273],
| 30.00th=[ 461], 40.00th=[ 644], 50.00th=[ 783], 60.00th=[ 889],
| 70.00th=[ 979], 80.00th=[ 1090], 90.00th=[ 1418], 95.00th=[ 1844],
| 99.00th=[ 3982], 99.50th=[ 4146], 99.90th=[ 4359], 99.95th=[ 4490],
| 99.99th=[ 5276]
bw ( KiB/s): min=29208, max=644352, per=100.00%, avg=150996.71, stdev=86341.93, samples=119
iops : min= 7302, max=161088, avg=37749.21, stdev=21585.48, samples=119
lat (usec) : 20=0.01%, 50=0.01%, 100=0.06%, 250=18.20%, 500=13.09%
lat (usec) : 750=14.92%, 1000=25.27%
lat (msec) : 2=24.32%, 4=3.21%, 10=0.92%, 20=0.01%
fsync/fdatasync/sync_file_range:
sync (usec): min=4, max=3950, avg=292.27, stdev=389.20
sync percentiles (usec):
| 1.00th=[ 12], 5.00th=[ 19], 10.00th=[ 24], 20.00th=[ 37],
| 30.00th=[ 63], 40.00th=[ 101], 50.00th=[ 163], 60.00th=[ 217],
| 70.00th=[ 314], 80.00th=[ 474], 90.00th=[ 717], 95.00th=[ 947],
| 99.00th=[ 2057], 99.50th=[ 2671], 99.90th=[ 3261], 99.95th=[ 3359],
| 99.99th=[ 3949]
cpu : usr=10.58%, sys=2.24%, ctx=379917, majf=6, minf=46
IO depths : 1=0.1%, 2=0.1%, 4=0.6%, 8=9.6%, 16=67.2%, 32=22.9%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=95.4%, 8=1.6%, 16=1.5%, 32=1.5%, 64=0.0%, >=64=0.0%
issued rwts: total=0,2264641,0,8014 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=147MiB/s (155MB/s), 147MiB/s-147MiB/s (155MB/s-155MB/s), io=8846MiB (9276MB), run=60001-60001msec
Enabling the setting "Drive Write Cache" in the BIOS seemed to boost performance as well:
Code:
random-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=32
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][11.7%][eta 00m:53s]
Jobs: 1 (f=1): [W(1)][21.7%][w=177MiB/s][w=45.3k IOPS][eta 00m:47s]
Jobs: 1 (f=1): [W(1)][31.7%][eta 00m:41s]
Jobs: 1 (f=1): [W(1)][41.0%][w=301MiB/s][w=77.1k IOPS][eta 00m:36s]
Jobs: 1 (f=1): [W(1)][50.8%][w=173MiB/s][w=44.3k IOPS][eta 00m:30s]
Jobs: 1 (f=1): [W(1)][60.0%][w=183MiB/s][w=46.7k IOPS][eta 00m:24s]
Jobs: 1 (f=1): [W(1)][70.5%][w=179MiB/s][w=45.9k IOPS][eta 00m:18s]
Jobs: 1 (f=1): [W(1)][80.3%][w=175MiB/s][w=44.9k IOPS][eta 00m:12s]
Jobs: 1 (f=1): [W(1)][90.2%][w=174MiB/s][w=44.5k IOPS][eta 00m:06s]
Jobs: 1 (f=1): [W(1)][100.0%][w=171MiB/s][w=43.9k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=11691: Fri Mar 18 22:37:21 2022
write: IOPS=37.0k, BW=148MiB/s (155MB/s)(8897MiB/60001msec); 0 zone resets
slat (nsec): min=120, max=230996, avg=394.22, stdev=772.39
clat (usec): min=9, max=1900.0k, avg=833.37, stdev=13955.21
lat (usec): min=9, max=1900.0k, avg=833.77, stdev=13955.21
clat percentiles (usec):
| 1.00th=[ 167], 5.00th=[ 194], 10.00th=[ 212], 20.00th=[ 265],
| 30.00th=[ 404], 40.00th=[ 529], 50.00th=[ 570], 60.00th=[ 627],
| 70.00th=[ 758], 80.00th=[ 906], 90.00th=[ 1106], 95.00th=[ 1385],
| 99.00th=[ 3720], 99.50th=[ 4047], 99.90th=[ 9241], 99.95th=[ 19530],
| 99.99th=[566232]
bw ( KiB/s): min= 256, max=563200, per=100.00%, avg=167112.44, stdev=90222.92, samples=108
iops : min= 64, max=140800, avg=41778.11, stdev=22555.73, samples=108
lat (usec) : 10=0.01%, 20=0.03%, 50=0.01%, 100=0.05%, 250=18.38%
lat (usec) : 500=15.99%, 750=35.21%, 1000=15.70%
lat (msec) : 2=12.72%, 4=1.37%, 10=0.44%, 20=0.04%, 50=0.03%
lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 2000=0.01%
fsync/fdatasync/sync_file_range:
sync (usec): min=5, max=3665, avg=217.48, stdev=267.10
sync percentiles (usec):
| 1.00th=[ 12], 5.00th=[ 18], 10.00th=[ 21], 20.00th=[ 28],
| 30.00th=[ 40], 40.00th=[ 64], 50.00th=[ 92], 60.00th=[ 169],
| 70.00th=[ 229], 80.00th=[ 408], 90.00th=[ 594], 95.00th=[ 717],
| 99.00th=[ 1237], 99.50th=[ 1385], 99.90th=[ 1614], 99.95th=[ 1647],
| 99.99th=[ 3654]
cpu : usr=8.59%, sys=1.72%, ctx=348556, majf=5, minf=49
IO depths : 1=0.1%, 2=0.2%, 4=2.1%, 8=13.2%, 16=64.6%, 32=20.3%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=95.9%, 8=1.0%, 16=1.3%, 32=1.8%, 64=0.0%, >=64=0.0%
issued rwts: total=0,2277556,0,8243 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=148MiB/s (155MB/s), 148MiB/s-148MiB/s (155MB/s-155MB/s), io=8897MiB (9329MB), run=60001-60001msec
So there must be a dramatic difference in the Linux driver for the B120i storage controller vs the one used in FreeBSD.