Hi,
sorry for the fast & direct question :) then again:
I have been working with storage systems for a long time; ZFS is a challenge, and nowadays not only for high-priced enterprise gear ;)
All the tests I do use the fio tool, changing only the parameters listed before each result below:
$ fio --filename=test_fio_file --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-te --size=100M
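Such a sweep can be scripted, for example (a minimal sketch, POSIX shell; only --numjobs varies here, everything else stays fixed):

$ for jobs in 1 10 100; do
    fio --filename=test_fio_file --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=$jobs --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-te --size=100M
  done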
"How are you getting UFS in FreeNAS?"
I know that FreeNAS has a nice GUI, but if you want to do deeper engineering investigation, the console is a must ;) and the newfs command is still part of FreeNAS, with UFS support.
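For example, a minimal UFS setup from the console looks roughly like this (XXXX is a placeholder, substitute your own gptid):

$ newfs -b 4096 /dev/gptid/XXXX
$ tunefs -t enable /dev/gptid/XXXX      # optional: enable the TRIM flag (fs must be unmounted)
$ mkdir -p /mnt/ufs_test
$ mount -o noatime /dev/gptid/XXXX /mnt/ufs_test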
I ran tests on:
- a mirror pool of only 2x Optane 900p 280GB - ashift 12 and 13
- 2x striped RAIDZ2 vdevs of 12x 4TB HDD (6x Toshiba, 6x HGST) over an LSI SAS2308 in IT mode connected to an HP D2600, with a mirrored Optane SLOG - ashift 12 and 13
- a mirror (6 by 6) of 12x 4TB HDD (6x Toshiba, 6x HGST) over an LSI SAS2308 in IT mode connected to an HP D2600, with a mirrored Optane SLOG - ashift 12 and 13
- direct UFS on an Optane (newfs & mount, no problem - TRIM disabled/enabled via tunefs, mounted with/without noatime, fs with 4/8/128kB block size)
The Optane partitions are aligned to 128x512B (64 KiB); see the first post.
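Such an alignment can be produced with gpart, roughly like this (nvd0 is a placeholder device name; use -t freebsd-ufs for the UFS case):

$ gpart create -s gpt nvd0
$ gpart add -t freebsd-zfs -a 64k nvd0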
On the mirror Optane pool:
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=100
Jobs: 100 (f=100): [W(100)][16.4%][r=0KiB/s,w=16.6MiB/s][r=0,w=4259 IOPS][eta 00m:00s]
On the mirror Optane pool:
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=1
Jobs: 1 (f=1): [W(1)][18.0%][r=0KiB/s,w=20.0MiB/s][r=0,w=5370 IOPS][eta 00m:00s]
Every time, zpool iostat shows a max of 7-8k IOPS:
capacity operations bandwidth
pool alloc free read write read write
-------------------------------------- ----- ----- ----- ----- ----- -----
local_nvme 1.07G 257G 0 6.79K 0 113M
mirror 1.07G 257G 0 6.79K 0 113M
gptid/51ddec4d-3460-11e9-926a-6805ca8ce59a - - 0 6.79K 0 113M
gptid/52ff0cf0-3460-11e9-926a-6805ca8ce59a - - 0 6.79K 0 113M
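(The per-vdev view above comes from something like the following, sampling every second:)

$ zpool iostat -v local_nvme 1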
"Based on your use of terms (ZIL, for example), I think it might help to ensure we are all using the same words if you review these guides."
I am only looking at the SLOG and good sync writes; async writes are handled perfectly, see the end of this post.
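On the ZFS side, whether writes go through the ZIL (and thus the SLOG) is controlled per dataset by the sync property, for example (the dataset name here is just an example):

$ zfs set sync=always local_nvme/test        # force every write through the ZIL
$ zfs set sync=standard local_nvme/test      # default: honor only application fsync/O_SYNC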
On the 2x striped RAIDZ2 pool with the mirrored NVMe SLOG:
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=100
Jobs: 100 (f=100): [W(100)][31.1%][r=0KiB/s,w=21.1MiB/s][r=0,w=5403 IOPS][eta 00m:00s]
capacity operations bandwidth
pool alloc free read write read write
-------------------------------------- ----- ----- ----- ----- ----- -----
sas1_pool 67.8G 43.4T 0 2.97K 0 119M
raidz2 33.9G 21.7T 0 0 0 0
gptid/2e8da3f4-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/30400089-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/32b78c12-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/3b7accf8-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/3d52a37e-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/4054cdec-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
raidz2 33.9G 21.7T 0 0 0 0
gptid/349e8c21-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/36e6f8b0-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/388513fa-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/42451792-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/450d9e70-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/4706ff42-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
logs - - - - - -
mirror 1.10G 257G 0 2.96K 0 118M
gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a - - 0 2.96K 0 118M
gptid/a45f1ba2-3463-11e9-926a-6805ca8ce59a - - 0 2.96K 0 118M
-------------------------------------- ----- ----- ----- ----- ----- -----
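For completeness, a mirrored SLOG like the one above gets attached with something along these lines (device names as in this post):

$ zpool add sas1_pool log mirror gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a gptid/a45f1ba2-3463-11e9-926a-6805ca8ce59a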
On the 2x striped RAIDZ2 pool with the mirrored NVMe SLOG:
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=1
Jobs: 1 (f=1): [W(1)][32.8%][r=0KiB/s,w=20.0MiB/s][r=0,w=5364 IOPS][eta 00m:00s]
capacity operations bandwidth
pool alloc free read write read write
-------------------------------------- ----- ----- ----- ----- ----- -----
sas1_pool 68.0G 43.4T 0 6.06K 0 96.9M
raidz2 34.0G 21.7T 0 0 0 0
gptid/2e8da3f4-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/30400089-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/32b78c12-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/3b7accf8-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/3d52a37e-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/4054cdec-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
raidz2 34.0G 21.7T 0 0 0 0
gptid/349e8c21-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/36e6f8b0-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/388513fa-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/42451792-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/450d9e70-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
gptid/4706ff42-3109-11e9-bff0-6805ca8ce59a - - 0 0 0 0
logs - - - - - -
mirror 505M 258G 0 6.05K 0 96.9M
gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a - - 0 6.05K 0 96.9M
gptid/a45f1ba2-3463-11e9-926a-6805ca8ce59a - - 0 6.05K 0 96.9M
-------------------------------------- ----- ----- ----- ----- ----- -----
Now UFS:
$ newfs -b 4096 gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a
gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a: 265042.9MB (542807808 sectors) block size 4096, fragment size 4096
using 5508 cylinder groups of 48.12MB, 12320 blks, 6160 inodes.
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=1
Jobs: 1 (f=1): [W(1)][34.4%][r=0KiB/s,w=55.9MiB/s][r=0,w=14.3k IOPS][eta 00m:00s]
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=100
Jobs: 100 (f=100): [W(100)][31.1%][r=0KiB/s,w=15.6MiB/s][r=0,w=3984 IOPS][eta 00m:00s]
$ newfs gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a
gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a: 265042.9MB (542807808 sectors) block size 32768, fragment size 4096
using 424 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
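(To double-check what newfs actually chose, the superblock can be inspected, for example:)

$ dumpfs /dev/gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a | grep -E 'bsize|fsize'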
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=100
Jobs: 100 (f=100): [W(100)][20.0%][r=0KiB/s,w=11.6MiB/s][r=0,w=2979 IOPS][eta 00m:00s]
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=1
Jobs: 1 (f=1): [W(1)][19.7%][r=0KiB/s,w=20.0MiB/s][r=0,w=5126 IOPS][eta 00m:00s]
Now direct writes to the raw partition:
$ fio --filename=/dev/gptid/a3618eaf-3463-11e9-926a-6805ca8ce59a --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-te --size=100M
Jobs: 1 (f=1): [W(1)][23.3%][r=0KiB/s,w=114MiB/s][r=0,w=29.1k IOPS][eta 00m:00s] >> 29.1k IOPS
Jobs: 10 (f=10): [W(10)][15.0%][r=0KiB/s,w=803MiB/s][r=0,w=206k IOPS][eta 00m:00s] >> 206k IOPS
Jobs: 100 (f=100): [W(100)][26.2%][r=0KiB/s,w=1246MiB/s][r=0,w=319k IOPS][eta 00m:00s] >> 319k IOPS
Yes, I know that UFS, ZFS, the SLOG, or whatever takes some overhead, but if we compare the 100-job direct write to the raw device (319k IOPS) against the SLOG, UFS, or mirror pool at only ca. 5k sync IOPS, it is horrible.
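A rough sanity check on those numbers: at numjobs=1 and iodepth=1 there is exactly one write in flight, so IOPS is simply 1/latency. The mirror pool's 5370 IOPS works out to ca. 186 µs per sync write, while the raw Optane's 29.1k IOPS works out to ca. 34 µs, so the stack is adding roughly 150 µs to every single write.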
===========================================================
For comparison, async direct writes:
On the mirror NVMe pool:
--direct=1 --sync=0 --rw=write --bs=1M --numjobs=100
Jobs: 100 (f=100): [W(100)][24.6%][r=0KiB/s,w=3516MiB/s][r=0,w=3516 IOPS][eta 00m:00s]
On the mirror NVMe pool:
--direct=1 --sync=0 --rw=write --bs=1M --numjobs=1
Jobs: 1 (f=1): [W(1)][16.7%][r=0KiB/s,w=1801MiB/s][r=0,w=1800 IOPS][eta 00m:00s]
On the 2x striped RAIDZ2 pool without NVMe SLOG:
--direct=1 --sync=0 --rw=write --bs=1M --numjobs=1
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=2108MiB/s][r=0,w=2108 IOPS][eta 00m:00s] - zpool iostat showing ca. 200MB/s write to the pool
On the 2x striped RAIDZ2 pool without NVMe SLOG:
--direct=1 --sync=0 --rw=write --bs=1M --numjobs=100
Jobs: 100 (f=100): [W(100)][90.0%][r=0KiB/s,w=5757MiB/s][r=0,w=5757 IOPS][eta 00m:00s] - zpool iostat showing ca. 60MB/s write to the pool - on a 512GB RAM system, no problem ;)
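(That gap between what fio reports and what zpool iostat shows hitting the disks is the async writes being absorbed into RAM and flushed in transaction groups; on FreeBSD/FreeNAS the flush interval is visible via, e.g.:)

$ sysctl vfs.zfs.txg.timeout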
This async write performance saturates 10Gb NFS without any problem, which is what we like to have :)
But sync writes over NFS reach only ca. 2-3k IOPS per connection; more NFS connections in total max out at 6-7k IOPS, sometimes hitting 10k IOPS.
NFS is exported from the ZFS pools (the HDD pool with/without SLOG, or the mirror NVMe pool).
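For reference, the per-connection NFS sync-write behavior described above can be reproduced from a Linux client with something like this (server name, export path, and mount point are placeholders):

$ mount -t nfs -o vers=3 server:/mnt/sas1_pool/share /mnt/nfs_test
$ fio --filename=/mnt/nfs_test/test_fio_file --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-te --size=100M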