Looking for advice

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
Dear forum, I'm looking for advice about the best choices for my home lab. As it is a home lab, I generally want to go cheap, and downtime can stretch to weeks :D

Currently I have TrueNAS on server-grade hardware: Supermicro X10SLL-F, 32GB DDR3 ECC RAM, Xeon E3-1230 v3, LSI 9207-8i, Fujitsu D2755 (2 x 10GbE), 5 x 4TB WD Red (no SMR), 6 x 1TB consumer-grade SSDs, 1 x 256GB Kingston A400 for the OS, plus a PCIe-to-NVMe adapter with a WD SN750 Black. I have a RAIDZ2 with the 5 WD Red disks. The NVMe and the SSDs are currently not configured.

The 2 x 10GbE ports are connected directly to a VMware box with consumer-grade hardware and the same network card, using twinax cables. This way, I set up iSCSI multipathing.

Currently, and only for testing purposes, I have a zvol created on the RAIDZ2 in order to share a block device over iSCSI. Everything works almost as expected; the only thing I still need to figure out is that when I restart the VMware ESXi server, the iSCSI configuration stops working, but that is a topic for another forum :)

Reading several topics on this forum, I found a lot of information about ZFS and how to improve things; however, it's not always clear how it applies to my case.

I read that for block sharing it is better to use a pool composed of mirrored vdevs. So in my case, I would need to make a new pool composed of 3 mirrors, which should give a pool of about 3TB.

The RAIDZ2 pool is used primarily for files, backups, installers, etc., and is shared through SMB. It doesn't really need improvement, since as it is it's fine for my low requirements, but out of curiosity I would like to know if I can do something to improve performance (more for educational purposes than for actual gains).

My main question is about the NVMe disk. Currently I have the default configuration regarding the ZIL, L2ARC and so on. Reading the posts and experiences here, I have found that (always regarding iSCSI and VMware):
best performance -> disable sync writes -> data at risk
lower performance -> enable sync writes -> data is safe

As my RAM is under 64GB (half of what is recommended for block sharing according to the "path to success" post), would you recommend enabling a SLOG on the NVMe disk? An L2ARC to improve reads? Something to improve writes? I'm not looking for the best performance ever. It's just a home lab, and if the speed is low, it is what it is for the given budget (a.k.a. I have a wife). But if I can make improvements with hardware I already have (or maybe by buying the Optane or the R200), I want to learn how to do it.
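If it helps frame the question, my understanding is that attaching a SLOG or an L2ARC is just adding the device to the pool, roughly like this (a sketch only; "tank" and "nvd0" are placeholders for my pool and the NVMe device, nvd0 being a typical FreeBSD device name for NVMe):

Code:
# add the NVMe as a SLOG (separate intent log) device
zpool add tank log nvd0

# or add it as an L2ARC (read cache) device instead
zpool add tank cache nvd0

# check the resulting layout
zpool status tank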

Thank you for reading! :)
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
Regarding the iSCSI, I just want to run a Windows server, a Windows client, a torrent box (probably Debian), a GitLab server, a Nextcloud with OnlyOffice, maybe a Docker server, one or two database servers, a Bitwarden instance to store passwords, a Linux client, that kind of thing. Not all running at the same time.
I know my hardware is limited, but my main question remains: what can I do to get the most performance out of it? Maybe by putting a SLOG on a regular SSD and the L2ARC on the NVMe I can gain something.

Regarding the RAIDZ2, I only use it for backup files and movie sharing, plus a jail with Plex to cast series. Thanks!
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
All this on a single pool? No zvols/datasets? Are you seeing speed/performance limits presently? Is this on a UPS, or is there a battery present for the controllers to finish writes?
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
Will you run some benchmarks on the pool just to give an idea? First on the NAS, then on the box running the iSCSI or VM?

Run both. The second will let things get up to temperature and add a bit of stress overall. If your iSCSI client is Windows, try winsat disk and winsat formal.

Here are the tests (adjust as your system calls for). This will crush a production environment while stress testing. Be warned.

SHORT TEST -
fio --filename=DOOKEYMONSTER --sync=1 --rw=randread --bs=64k --numjobs=4 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=30 --direct=1


LONG TEST -
fio --filename=DOOKEYiMONSTER --sync=1 --rw=randrw --bs=64k --numjobs=32 --iodepth=256 --group_reporting --name=test --filesize=2T --runtime=600 --direct=1
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
*** Note: run the fio test in /mnt/TANK, or wherever the actual storage is located.
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
All this on a single pool? No zvols/datasets? Are you seeing speed/performance limits presently? Is this on a UPS, or is there a battery present for the controllers to finish writes?

Hi! Currently there is only one RAIDZ2 pool with 5 x 4TB (usable size about 11TB). I have 4 datasets and 1 zvol. Two datasets are for backup files, the third one is for my files, and the last one is for movies. The zvol is served through iSCSI to the VMware ESXi server. But the plan is to delete it from the RAIDZ2 pool and move it to a pool of mirrors (a new pool made of 3 x 1TB mirrored vdevs).
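Roughly, what I have in mind for the new pool is something like this (just a sketch; the pool name, device names and zvol size are placeholders, not my actual layout):

Code:
# pool of three mirrored vdevs from the six 1TB SSDs
zpool create ssdpool mirror da1 da2 mirror da3 da4 mirror da5 da6

# sparse zvol on it, to be exported over iSCSI
zfs create -s -V 500G -o volblocksize=16K ssdpool/vmstore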

Will try to do that and post the results. Thanks!
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
I will assume the 10GbE connections from above for the iSCSI. You likely will not see more speed/performance over the 10G connection, as it is going to max out at about 1000MB/s (that is roughly 10Gb/s), +/- overhead and other background processes. If the intent is to improve performance via SSD, it likely will not happen; the bottleneck is the 10G link. The performance gain with SSDs will be in seek/latency for reads and writes, but not in MB/s.

All based on my own assumptions.

Have you considered a LAGG using LACP to get 20G, or an upgraded card in the NAS and ESXi, to address what may prove to be the bottleneck?

Awaiting the fio/winsat results.
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
Yes, that's why I did iSCSI multipath and not LACP/LAGG. I set the MTU to 9000, the same iSCSI portal on both IPs, round robin with the IOPS limit set to 1, etc.
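For reference, the ESXi side of that was roughly as follows (from memory, so treat it as a sketch; naa.XXXX stands in for the actual LUN identifier):

Code:
# set the path selection policy of the iSCSI LUN to round robin
esxcli storage nmp device set --device naa.XXXX --psp VMW_PSP_RR

# switch paths after every single I/O instead of the default 1000
esxcli storage nmp psp roundrobin deviceconfig set --device naa.XXXX --type iops --iops 1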

I can consider a few different things:
- Change the network to 2 x 40GbE
- Add Optane/R200/SSDs for SLOG or L2ARC
- Confirm I have already reached the top performance for this humble hardware, and that's it xD
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
OK, then.

From TrueNAS (with a 10G file size):

Code:
test: (g=0): rw=randread, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=psync, iodepth=4
...
fio-3.27
Starting 4 processes
test: Laying out IO file (1 file / 10240MiB)

test: (groupid=0, jobs=4): err= 0: pid=53425: Thu Oct 27 13:34:27 2022
  read: IOPS=72.6k, BW=4536MiB/s (4756MB/s)(40.0GiB/9030msec)
    clat (usec): min=4, max=3073, avg=52.85, stdev=36.19
     lat (usec): min=4, max=3073, avg=52.94, stdev=36.20
    clat percentiles (usec):
     |  1.00th=[   13],  5.00th=[   18], 10.00th=[   24], 20.00th=[   39],
     | 30.00th=[   43], 40.00th=[   47], 50.00th=[   50], 60.00th=[   55],
     | 70.00th=[   61], 80.00th=[   69], 90.00th=[   76], 95.00th=[   82],
     | 99.00th=[  124], 99.50th=[  145], 99.90th=[  725], 99.95th=[  840],
     | 99.99th=[ 1012]
   bw (  MiB/s): min= 3929, max= 5659, per=100.00%, avg=4553.54, stdev=119.10, samples=68
   iops        : min=62870, max=90552, avg=72854.76, stdev=1905.58, samples=68
  lat (usec)   : 10=0.46%, 20=6.98%, 50=43.49%, 100=46.91%, 250=1.95%
  lat (usec)   : 500=0.07%, 750=0.04%, 1000=0.08%
  lat (msec)   : 2=0.01%, 4=0.01%
  cpu          : usr=2.42%, sys=94.97%, ctx=20014, majf=0, minf=64
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=655360,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
   READ: bw=4536MiB/s (4756MB/s), 4536MiB/s-4536MiB/s (4756MB/s-4756MB/s), io=40.0GiB (42.9GB), run=9030-9030msec


From TrueNAS (with a 4G file size):

NOTE: I had to use 4G because the Linux VM I already have running has an 8G disk assigned, so to compare with the same options I repeated the test after testing the VM.

Code:
test: (g=0): rw=randread, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=psync, iodepth=4
...
fio-3.27
Starting 4 processes

test: (groupid=0, jobs=4): err= 0: pid=88079: Thu Oct 27 15:27:52 2022
  read: IOPS=81.4k, BW=5088MiB/s (5335MB/s)(16.0GiB/3220msec)
    clat (usec): min=4, max=469, avg=45.52, stdev=16.51
     lat (usec): min=4, max=469, avg=45.59, stdev=16.53
    clat percentiles (usec):
     |  1.00th=[   10],  5.00th=[   16], 10.00th=[   20], 20.00th=[   34],
     | 30.00th=[   41], 40.00th=[   45], 50.00th=[   48], 60.00th=[   50],
     | 70.00th=[   53], 80.00th=[   58], 90.00th=[   69], 95.00th=[   73],
     | 99.00th=[   80], 99.50th=[   83], 99.90th=[   90], 99.95th=[   93],
     | 99.99th=[  108]
   bw (  MiB/s): min= 4358, max= 5945, per=100.00%, avg=5182.06, stdev=147.59, samples=22
   iops        : min=69738, max=95128, avg=82911.00, stdev=2361.42, samples=22
  lat (usec)   : 10=1.13%, 20=9.39%, 50=49.85%, 100=39.60%, 250=0.02%
  lat (usec)   : 500=0.01%
  cpu          : usr=2.52%, sys=97.44%, ctx=338, majf=0, minf=64
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
   READ: bw=5088MiB/s (5335MB/s), 5088MiB/s-5088MiB/s (5335MB/s-5335MB/s), io=16.0GiB (17.2GB), run=3220-3220msec


From the Debian VM on the ESXi (I was not able to run fio from the console) (4G file size):
[Screenshot: fio results from the Debian VM (screenshot_torrentbox_fio.png)]



I will post the 2T results in the next post.
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
OK, then. It took ages to finish...

Code:
test: (g=0): rw=randrw, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=psync, iodepth=256
...
fio-3.27
Starting 32 processes
test: Laying out IO file (1 file / 2097152MiB)

test: (groupid=0, jobs=32): err= 0: pid=50475: Thu Oct 27 18:58:41 2022
  read: IOPS=23, BW=1495KiB/s (1530kB/s)(877MiB/600600msec)
    clat (usec): min=16, max=3794.0k, avg=501697.83, stdev=212689.37
     lat (usec): min=17, max=3794.0k, avg=501698.34, stdev=212689.38
    clat percentiles (msec):
     |  1.00th=[   59],  5.00th=[  165], 10.00th=[  234], 20.00th=[  326],
     | 30.00th=[  397], 40.00th=[  451], 50.00th=[  498], 60.00th=[  542],
     | 70.00th=[  600], 80.00th=[  667], 90.00th=[  760], 95.00th=[  860],
     | 99.00th=[ 1036], 99.50th=[ 1167], 99.90th=[ 1485], 99.95th=[ 1653],
     | 99.99th=[ 1905]
   bw (  KiB/s): min= 3968, max=10539, per=100.00%, avg=4383.64, stdev=36.27, samples=12910
   iops        : min=   32, max=  148, avg=41.30, stdev= 0.65, samples=12910
  write: IOPS=23, BW=1500KiB/s (1536kB/s)(880MiB/600600msec); 0 zone resets
    clat (msec): min=98, max=3489, avg=857.80, stdev=230.86
     lat (msec): min=98, max=3489, avg=857.80, stdev=230.86
    clat percentiles (msec):
     |  1.00th=[  355],  5.00th=[  493], 10.00th=[  575], 20.00th=[  667],
     | 30.00th=[  735], 40.00th=[  793], 50.00th=[  852], 60.00th=[  911],
     | 70.00th=[  969], 80.00th=[ 1045], 90.00th=[ 1150], 95.00th=[ 1234],
     | 99.00th=[ 1435], 99.50th=[ 1502], 99.90th=[ 1871], 99.95th=[ 2106],
     | 99.99th=[ 3473]
   bw (  KiB/s): min= 3968, max= 7869, per=100.00%, avg=4059.39, stdev= 9.37, samples=13985
   iops        : min=   32, max=  102, avg=36.23, stdev= 0.36, samples=13985
  lat (usec)   : 20=0.02%, 50=0.10%, 100=0.11%, 250=0.01%
  lat (msec)   : 50=0.17%, 100=0.74%, 250=4.75%, 500=21.87%, 750=32.90%
  lat (msec)   : 1000=25.74%, 2000=13.56%, >=2000=0.04%
  cpu          : usr=0.00%, sys=0.02%, ctx=101740, majf=2, minf=71
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=14026,14077,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
   READ: bw=1495KiB/s (1530kB/s), 1495KiB/s-1495KiB/s (1530kB/s-1530kB/s), io=877MiB (919MB), run=600600-600600msec
  WRITE: bw=1500KiB/s (1536kB/s), 1500KiB/s-1500KiB/s (1536kB/s-1536kB/s), io=880MiB (923MB), run=600600-600600msec


Thanks for your input! ;)
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
All this on a single pool? No zvols/datasets? Are you seeing speed/performance limits presently? Is this on a UPS, or is there a battery present for the controllers to finish writes?
Currently on a UPS; no controller with a battery.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
UPS or have battery present for controllers to finish writes?

This is not necessary with ZFS. As long as the controller doesn't reorder writes (see the various discussions about RAID controllers, which are evil for ZFS), ZFS guarantees that writes are sent to the drives in the proper order to ensure data integrity.
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
This is not necessary with ZFS. As long as the controller doesn't reorder writes (see the various discussions about RAID controllers, which are evil for ZFS), ZFS guarantees that writes are sent to the drives in the proper order to ensure data integrity.
I was referring to the battery/UPS, as in suggesting that not having one while having sync disabled is a bad idea. I was not referring to using RAID cards/HBAs and ZFS together. I will be clearer about that in future responses.
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
So, as I have a UPS, would you recommend disabling sync=always?

Any advice based on the fio results? Thank you! :)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So, as I have a UPS, would you recommend disabling sync=always?

Having a UPS is not particularly relevant. It is just protecting the server from brownouts and line noise. A server can panic and crash, overheat and crash, and/or die in many other ways.

The setting for sync= should be determined by your workload and pool design. If you are handling banking transactions, virtual machine disk files for a remote hypervisor, and other situations with no tolerance for data loss/corruption, and you have an appropriate SLOG device, then you can set sync=always. Otherwise you should probably be running the default setting.

These are obviously at least somewhat related, in that if you have a use case that benefits from or requires sync=always, you would probably also want to have a UPS, just to help keep the availability of the system closer to 100%.

When you do not have sync=always, what happens is that the transaction groups (txg's) being built and/or flushed in main memory will be lost if the server loses power, crashes, panics, or is hard rebooted. However, a txg, once fully committed to disk, is guaranteed to leave the pool in a state of integrity and consistency.

That's why your question

So, as I have a UPS, would you recommend disabling sync=always?

is a bit of a boggle. Having a UPS doesn't magically make it safe to disable sync=always. It would be necessary to consider why someone thought sync=always was necessary in order to understand the risks.

It's okay if that makes your head hurt trying to wrap your brain around it. It's a bit abstract.
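For reference, sync= is a per-dataset and per-zvol property, so it can be scoped to just the data that needs it rather than set pool-wide. A rough sketch, with placeholder pool and zvol names:

Code:
# require synchronous semantics for the zvol backing the VMs
zfs set sync=always ssdpool/vmstore

# leave other datasets at the inherited default (honor client sync requests)
zfs inherit sync tank/backups

# verify the current setting
zfs get sync ssdpool/vmstore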
 

faktorqm

Dabbler
Joined
Jan 18, 2019
Messages
25
Thanks a lot! Would you recommend the WD Black SN750 as a SLOG? I mean, you said "appropriate SLOG device" but I'm not quite sure which devices qualify.

I'm having fun with my home lab. It's true that I do not want to reinstall a VM every month, but if I lose anything, it should not be a big problem. Also, I have backups of the VMs, so I'm not that reckless :P

Returning to my main question: can I do something to improve the performance, or is my home lab already within the "normal" speed range and that's it?

Thanks again for your time and explanations!
 

Syptec

Dabbler
Joined
Aug 3, 2018
Messages
42
Can you repeat the fio test and, while it is running, do gstat -f da and watch the result? How saturated is your system? I suspect that at nearly 5G your drives are maxed out...
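Something like this, in two shells (a sketch; adjust the path, filename and device filter to your pool):

Code:
# shell 1: the short fio test against the pool
cd /mnt/TANK && fio --filename=testfile --sync=1 --rw=randread --bs=64k --numjobs=4 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=30 --direct=1

# shell 2: watch per-disk %busy and latency while it runs
gstat -f 'da[0-9]'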
 