SSD Array Performance

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
BPN-SAS3-216A-N4 w/ 24 SSDs & 3x 9340-8i (SAS3) - easy 1GB/s read/write via SMB. But over NFS my numbers are still low - around 500/500 MB/s. Ideas?


HP D3700 tray - works, but I need an HP SAS3 HBA to correct a firmware mismatch on the I/O controllers - I'll probably resell it - haven't gotten around to doing solid benchmarks - dad brain has been mush recently.
 

Holt Andrei Tiberiu

Contributor
Joined
Jan 13, 2016
Messages
129
Until you do the firmware update, leave only 1 EMM in the shelf.


BPN-SAS3-216A-N4 w/ 24 SSDs & 3x 9340-8i (SAS3) - easy 1GB/s read/write via SMB. But over NFS my numbers are still low - around 500/500 MB/s. Ideas?


HP D3700 tray - works, but I need an HP SAS3 HBA to correct a firmware mismatch on the I/O controllers - I'll probably resell it - haven't gotten around to doing solid benchmarks - dad brain has been mush recently.
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
So - here is where I am - I keep getting the same numbers no matter what hardware I've thrown at it. (24-port to 1 expander gave similar numbers.)


1556839592056.png


This is with 12 vdevs - mirrored Intel S3700s (24 drives) - my RAIDZ2 layout had similar numbers. This box has 3x SAS3 controllers, so each card only has 8 SSDs on it. I've put some heft behind it with dual E5-2690 v2s and 256GB of RAM... I thought for SURE this would fix any NFS speed issues. I can throw in an Optane SLOG and it helps the performance numbers, but from everything I've read, this setup should EASILY max my 10GbE EVEN with sync enabled.

Code:
root@freenas[/mnt/SSDArray]# dd if=/dev/zero of=/mnt/SSDArray/NFS/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 7.286047 secs (2878312412 bytes/sec)
root@freenas[/mnt/SSDArray]#
root@freenas[/mnt/SSDArray]# dd of=/dev/null if=/mnt/SSDArray/NFS/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 2.548625 secs (8228563564 bytes/sec)
root@freenas[/mnt/SSDArray]#
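
(For anyone else chasing this: a quick way to see how much of the NFS gap is sync-write overhead is to flip the sync property on the dataset and rerun the client-side test. This is a test-only sketch - the dataset name is assumed from the mount path above, and sync shouldn't be left disabled for VM storage.)

Code:
# Check the current sync setting (dataset name assumed from the mount path above)
zfs get sync SSDArray/NFS

# Temporarily disable sync writes, rerun the NFS test from the client, and compare
zfs set sync=disabled SSDArray/NFS

# Force every write through the ZIL/SLOG to see the worst case
zfs set sync=always SSDArray/NFS

# Restore the default when finished
zfs set sync=standard SSDArray/NFS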




The D3700/firmware is out of the picture for now - I never even got around to benchmarking it, since the numbers from my first shelf/server/mobo combo don't make sense to me.
 
Last edited:

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
Rebuilt an ESXi host and that seems to have improved my performance. There may have been some sort of issue with NFSv3 vs. NFSv4.1 on the FreeNAS box. (On a hunch I enabled NFSv4, rebooted, disabled it, rebooted, re-enabled 4.1, and rebooted again.)

1556939864355.png

I would've thought that with 12 mirrored SSD vdevs I would have pegged the write speeds...

With an Optane SLOG added to the pool:
1556939932637.png

These numbers finally look better - where they should be. Time to see what minimum equipment I need to max out 10GbE line speed for NFS. I think my SMB numbers on spinning rust are limited by the 64MB of cache per drive, vs. the 256MB some of the larger drives have.
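
(For reference, the NFS version each datastore actually mounted with can be confirmed from the ESXi side - a quick sketch, run on the host over SSH.)

Code:
# Datastores mounted over NFSv3
esxcli storage nfs list

# Datastores mounted over NFSv4.1
esxcli storage nfs41 list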
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
So you have a 12-vdev pool with a theoretical write capability of ~6GB/s. With dd you get 2.7GB/s locally (assuming the dataset was not on sync=always).

Of those 2.7GB/s you get 770MB/s (2MB blocksize) via NFS (which is then synced speed), and with the Optane SLOG you get 1.07GB/s.
If you use that pool for ESXi VMs you will fall down to 64K blocksize, which will leave you at around 280MB/s (of course before in-memory write caching).

It would be interesting to see what your speed is via iSCSI, or with sync=disabled on the dataset (to see the impact of networking on the 2.7GB/s local speed).
Also, there is still the question of why you only get 2.7GB/s out of the theoretical maximum.

My observations indicate that ZFS in general (not only FreeNAS) does not scale so well at the higher end; i.e.,
the speed increase per added vdev diminishes with the total number of vdevs in the pool, until it barely makes sense to add another vdev.
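
(If anyone wants to see that curve on their own hardware, a rough and destructive way is to build a throwaway pool and rerun the same dd after each added mirror. This is a sketch only - device names are placeholders and the disks must be spares; note that with lz4 compression on, /dev/zero writes compress away and inflate the numbers.)

Code:
# DESTROYS DATA on the listed disks - use spare/test drives only (da0..da3 are placeholders)
zpool create -m /mnt/testpool testpool mirror da0 da1
zfs set compression=off testpool   # /dev/zero compresses to nothing under lz4, skewing results
dd if=/dev/zero of=/mnt/testpool/ddfile bs=2048k count=10000

# Add another mirror vdev and repeat; note how much the number actually improves each time
zpool add testpool mirror da2 da3
dd if=/dev/zero of=/mnt/testpool/ddfile bs=2048k count=10000

# Clean up when done
zpool destroy testpool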
 

Holt Andrei Tiberiu

Contributor
Joined
Jan 13, 2016
Messages
129
Any news?
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
Any news?
So I actually settled - I wasn't getting the higher performance I think I should have, even after tuning some things here and there, so I just accepted what I had.

So I'm running 24x 400GB SSDs in 12 mirrored vdevs, plus a second tier of 12x 4TB spinners in 6 mirrored vdevs.

My knowledge wasn't really in-depth enough, and I didn't have too much time to investigate further.

I found that even with 3x SAS3 controllers onboard (each set of 8 drives had its own controller), my dd numbers were 2700/8000 MB/s - almost exactly the same as running the HP D3700 tray off a single external SAS3 card with 24x SSDs in 12 mirrored vdevs.

So at this point I've settled: I can max out the 10GbE moving pictures around and running my VMs off it (with the Optane SLOG). The spinner pool will hold my Plex data.

I'm disappointed with the performance, but I don't have the time or money to investigate much further. As of now I've moved everything off my QNAP, but I have some troubleshooting left (NFSv4 file permissions and the GPO issues).

Sold all my extra SSDs, and have the D3700 tray and some of my other SM chassis/mobos on eBay now. Trying not to tempt myself with a G9/G10 with NVMe drives to see what I'd get.
 

Holt Andrei Tiberiu

Contributor
Joined
Jan 13, 2016
Messages
129
Thanks for the info.

I had some speed limits myself, and some weird data errors with SATA drives in my Dell MD1200 and MD1220 shelves.
I have moved away from SATA and started using only SAS drives; going SAS solved my problem.
Now I am reconfiguring my main box for the 4th time in 6 months.
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
Just threw some more money at my system... (a small car's worth, unfortunately). This is with just 1 mirrored vdev of 2x HUSMR7619BDP3Y1.
This is with NO SLOG - I've upgraded to a 40GbE interface (for this one ESXi node). Ordered another SM NVMe card to get to 2x mirrored vdevs. Curious why write seems to bottleneck around 1GB/s - it was the same with the Optane 900P PCIe (even over 40GbE).
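
(For reference, the raw 40GbE path can be sanity-checked separately from storage with iperf3, which recent FreeNAS versions include - a rough sketch, the client address is a placeholder.)

Code:
# On the FreeNAS box: start an iperf3 server
iperf3 -s

# From a client on the 40GbE network (another box or a VM): 30 seconds, 4 parallel streams
iperf3 -c 192.168.10.5 -t 30 -P 4
# Add -R to test the reverse direction as well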

1559860519448.png



HUSMR7619BDP3Y1 (Above) 1920GB
3350 MBps (read) / 2100 MBps (write)
4KB Random Write : 75000 IOPS
4KB Random Read : 835000 IOPS

Also just bought 20x PX04SVB160 (12Gb/s SAS3) 1600GB
1900 MBps (read) / 850 MBps (write)
4KB Random Write : 60000 IOPS
4KB Random Read : 270000 IOPS
(The specs were a little hard to figure out - I'm not even sure if these are right.)



Another interesting thing - moving my 24x Intel S3700 400GB SSDs from the main system with 3x dedicated M1215 SAS3 cards to a single JBOD system with an EL1 expander, connected to a SINGLE external SAS3 card - the numbers were about the same...
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
So how did it go with the new drives - did going SAS resolve the issue?
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
So how did it go with the new drives - did going SAS resolve the issue?

SAS did help speed the system up (the SAS drives alone almost doubled my read/write). So I got myself a super-fast 22TB array.
My system numbers seemed pretty good -
Code:
"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 1024 kBytes "
"Output is in kBytes/sec"

"  Initial write " 11665157.62
"        Rewrite " 8288589.09
"           Read " 15753663.44
"        Re-read " 15640108.69
"    Random read " 15953570.19
"   Random write " 7414903.09

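(For context, that report is iozone's throughput mode; an invocation along these lines would produce it - the flags, file size, and thread count here are my guess, not the exact command used.)

Code:
# Throughput mode (-t 4 = 4 processes), 1MB records, 8GB per file,
# tests 0/1/2 = write+rewrite, read+reread, random read/write, -R = Excel-style report
iozone -R -i 0 -i 1 -i 2 -r 1024k -s 8g -t 4 \
    -F /mnt/SASSSD/ioz1 /mnt/SASSSD/ioz2 /mnt/SASSSD/ioz3 /mnt/SASSSD/ioz4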

But what I now realize is that my goals/expectations were a bit off.

I.e., a super-fast array does not mean tons of pictures (small files) transfer all that much faster. (I cut myself off before dropping more money on a full 40GbE infrastructure - I got one of those $50 40GbE Mellanox InfiniBand switches, but never got around to using it.)

My VMs load pretty fast now, lol.

I wanted to try dual 40GbE ports across my 3-node ESXi cluster to a multipath iSCSI target... Bet I could get some great numbers with sync writes off too.
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
Great, glad to hear that.
That's 7GB/s random write at 1MB blocksize? That's a huge jump from the old results :O

Could you elaborate on the final setup a bit more?
How many threads / which queue depth is that? Local or remote? Which / how many HBAs, and all that :)
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
Yeah, 1MB blocks -

SuperMicro X9DRH-iTF, 2x E5-2570 v0, 256GB RAM
1x BPN-SAS3-216A-N4
2x 9400-16i
1x 9300-8e
1x Intel 560SFP+ 10gb
2x AOC-SLG3-2E4 (PCIE to NVME)
4x Hitachi HUSMR7619BDP3Y1 (NVME) 1920GB
20x Toshiba PX04SVB160 SSD (12Gb/s SAS3) 1600GB

As for threads and queue depth - not sure. This server made me realize how out of my depth I am - but it works for now. (And I'm sorta out of PCIe slots, so...)
 

Attachments

  • RACK.jpg

Rand

Guru
Joined
Dec 30, 2013
Messages
906
So how is the pool layout; which drives make up which pool (or L2ARC/SLOG)? (zpool status)
Are those local test results (which tool) or remote (which tool) - or where do the numbers come from? :)

The 20 SAS drives are internal, 10 on each 9400-16i?
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
So how is the pool layout; which drives make up which pool (or L2ARC/SLOG)? (zpool status)
Are those local test results (which tool) or remote (which tool) - or where do the numbers come from? :)

The 20 SAS drives are internal, 10 on each 9400-16i?

Actually you got me..

8x SAS on each 9400 (not 10)
4x SAS on 9300

2x NVME on each AOC-SLG3-2E4

I wanted to gain a PCIe slot back, but I found that even though the 9400s are tri-mode, if you mix NVMe and SAS on one it knocks the SAS down to 6Gb/s (from 12Gb/s).

No SLOG - though I had an Optane PCIe card in initially (I'm out of PCIe slots).
No L2ARC - because why? Sure, I've got 256GB of RAM, but I've got nothing that needs accelerating that isn't already on my SAS pool.


The numbers came from iozone. I have a pair of mirrored NVMe vdevs for VM storage but still haven't spun it up yet (I was wanting to try 40Gb multipath iSCSI to it, but ran out of money and time). Went with RAIDZ3 to maximize space, and rebuild speed shouldn't be that bad (I have a cold spare on hand). Plus I've got this pool backed up to an offsite storage server in a colo (via an OpenVPN tunnel - FiOS 400/400) and backed up to a cloud provider, so I figure I'm okay risk-wise with RAIDZ3.
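
(On the offsite copy: FreeNAS's replication tasks handle this from the GUI, but under the hood it is just snapshots plus zfs send over SSH - a rough sketch with placeholder host and target names, not the actual setup.)

Code:
# First full copy: recursive snapshot, streamed to the colo box over SSH (through the VPN)
zfs snapshot -r SASSSD@offsite-1
zfs send -R SASSSD@offsite-1 | ssh backup.colo.example zfs receive -F backuppool/SASSSD

# Later runs only send the changes since the previous snapshot (incremental)
zfs snapshot -r SASSSD@offsite-2
zfs send -R -i @offsite-1 SASSSD@offsite-2 | ssh backup.colo.example zfs receive -F backuppool/SASSSD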


Code:
root@flash[~]# zpool status
  pool: NVME
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:44 with 0 errors on Sun Jul 14 03:00:44 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        NVME                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/76634228-888b-11e9-b7ad-002590eda6f4  ONLINE       0     0     0
            gptid/7d605123-888b-11e9-b7ad-002590eda6f4  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/30d4c73b-8bf2-11e9-a34b-002590eda6f4  ONLINE       0     0     0
            gptid/37be61da-8bf2-11e9-a34b-002590eda6f4  ONLINE       0     0     0

errors: No known data errors

  pool: SASSSD
 state: ONLINE
  scan: scrub repaired 0 in 0 days 01:22:40 with 0 errors on Sun Jul 28 01:22:40 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        SASSSD                                          ONLINE       0     0     0
          raidz3-0                                      ONLINE       0     0     0
            gptid/d6828222-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/dd9d8c84-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/e4ce2996-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/ebf6fc55-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/f306afdc-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/fa244488-927c-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/01659edd-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/088d94ab-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/0faefa7a-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/16da5310-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/1e099e51-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/253283b7-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/2c5b3f5b-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/3388e91c-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/3adf5b7d-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/4231a0e7-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/498dce72-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/50d207fd-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/580eb21d-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0
            gptid/5f7d224a-927d-11e9-955c-002590eda6f4  ONLINE       0     0     0

errors: No known data errors
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
OK, so you optimized for streaming writes, which explains why your pics are not transferring faster (more random than streaming reads).

Are those 9400W-16i (x16 slot) or regular (x8 slot)? And why a 9300-8e for the last 4 drives? Is that just what you had, or a typo?
 
Last edited:

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
OK, so you optimized for streaming writes, which explains why your pics are not transferring faster (more random than streaming reads).

Are those 9400W-16i (x16 slot) or regular (x8 slot)? And why a 9300-8e for the last 4 drives? Is that just what you had, or a typo?

2x SAS 9400-16i - PCIe 3.1 x8 each
9300-8i (what I had around) - for the remaining 4x SAS, since the 9400s can only do 8x each
9300-8e - attaches my external trays (currently empty) - future expansion.
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
Ah OK, you didn't mention the 9300-8i in the first post, that's why I wondered.

How did you run the test? Linux/Windows with a network drive? CIFS/SMB?

Any chance you can run some further tests on that pool? (The ATTO run you used above, for example.)
Also wondering how your NVMe pool will do - I had less than stellar success with 2x2 Optane 900Ps...

Any optimizations applied to FreeNAS? 10G buffers, jumbo frames, etc.?
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
Because I just can't leave well enough alone..

Optimizations were just the standard FreeNAS checkbox ones (autotune).

Ordered some Cisco 40GbE BiDi optics - supposedly compatible with my Mellanox 40GbE - but no link came up. iSCSI performance between a Windows 10 desktop and FreeNAS seemed much better than ESXi (with sync on). Got active 40GbE fiber coming next week to test more...

My dd numbers are similar between my 2 data pools (SAS3 vs. NVMe). A pair of DL380 G9s fell into my lap yesterday - I want to test numbers on those with NVMe (via the enablement kit - not sure if it's feasible or how many drives it can take, vs. the 4x the SM chassis holds at the moment).

Ran some recent DD numbers -

NVMe pool (4x 1.92TB NVMe, 2x mirrored vdevs)
Write 3.4GB/sec
Read 7.03GB/sec

SAS pool (20x 1.6TB SAS3, 1x RAIDZ3 vdev)
Write 3.34GB/sec
Read 7.14GB/sec

Code:
WRITE (4x 1.92 NVME Mirrored) (4x 1.92 NVME Mirrored)

root@flash[/mnt/NVME]# dd if=/dev/zero of=/mnt/NVME/ddfile bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 6.119236 secs (3427146872 bytes/sec) 3.4GB/Sec
root@flash[/mnt/NVME]# dd if=/dev/zero of=/mnt/NVME/ddfile bs=2048k count=20000
20000+0 records in
20000+0 records out
41943040000 bytes transferred in 12.453960 secs (3367847735 bytes/sec) 3.26GB/Sec

READ (4x 1.92 NVME Mirrored) (4x 1.92 NVME Mirrored)

root@flash[/mnt/NVME]# dd if=/mnt/NVME/ddfile of=/dev/zero bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 2.982404 secs (7031750292 bytes/sec) 7.03GB/Sec
root@flash[/mnt/NVME]# dd if=/mnt/NVME/ddfile of=/dev/zero bs=2048k count=20000
20000+0 records in
20000+0 records out
41943040000 bytes transferred in 6.020859 secs (6966287966 bytes/sec) 6.96GB/Sec
root@flash[/mnt/NVME]#

--------------------------------------------------------------------------------

WRITE (20x 1.6TB SAS3 RAIDZ3) 1 VDEV

root@flash[/mnt/SASSSD]# dd if=/dev/zero of=/mnt/SASSSD/ddfile bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 6.267625 secs (3346007383 bytes/sec) 3.34GB/Sec
root@flash[/mnt/SASSSD]# dd if=/dev/zero of=/mnt/SASSSD/ddfile bs=2048k count=20000
20000+0 records in
20000+0 records out
41943040000 bytes transferred in 12.791088 secs (3279083146 bytes/sec) 3.27GB/Sec

READ (20x 1.6TB SAS3 RAIDZ3) 1 VDEV
root@flash[/mnt/SASSSD]# dd if=/mnt/SASSSD/ddfile of=/dev/zero bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 2.933611 secs (7148704093 bytes/sec) 7.14GB/Sec
root@flash[/mnt/SASSSD]# dd if=/mnt/SASSSD/ddfile of=/dev/zero bs=2048k count=20000
20000+0 records in
20000+0 records out
41943040000 bytes transferred in 5.851915 secs (7167403939 bytes/sec) 7.16GB/Sec



Mirrored pair of NVMe drives.
Still getting some odd numbers - via iSCSI over 40GbE (sync disabled):
1573276443249.png

Reads seem way too low...


And this is with NFSv4 enabled on FreeNAS (NFSv3 write is OK, read is sub-600) (sync disabled):
1573276511925.png
 

Poached_Eggs

Dabbler
Joined
Sep 17, 2018
Messages
31
For kicks I installed Server 2019 bare metal on one of my (now former) ESXi 6.7u3 boxes. Windows iSCSI to FreeNAS seems more in line with where I thought I should be. Why not just go with NFSv4? Well, ESXi doesn't share the datastore nicely across hosts - instead I get datastore, datastore(1), datastore(2) - and vMotion doesn't care for that (since they're technically different datastores)... so what's the point then?
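
(The datastore(1)/datastore(2) thing usually happens when hosts mount the export under different labels or server addresses; mounting it explicitly with identical parameters on every host generally keeps vCenter treating it as one shared datastore - a sketch with placeholder host, export path, and label.)

Code:
# Run on each ESXi host with the exact same server address, export path, and label
esxcli storage nfs add -H 10.0.0.10 -s /mnt/SSDArray/NFS -v freenas-nfs
# NFS 4.1 variant
esxcli storage nfs41 add -H 10.0.0.10 -s /mnt/SSDArray/NFS -v freenas-nfs41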

At this point I'm thinking I'd rather sell the 2x Supermicro NVMe cards and the 4x NVMe drives, as it seems my SAS3 array gives better mid-to-top-range performance.

SAS3 (20x 1.6TB SSD, RAIDZ3, 1 vdev)
1573316515311.png



NVMe (4x 1.92TB SSD, RAIDZ1, 2 vdevs)
1573316636811.png
 

Attachments

  • 1573315937991.png