Benchmarked 15 x 15k SAS2 striped drives, only got 10 Gb/s max... best I can get?

digity

Contributor
Joined
Apr 24, 2016
Messages
156
I threw together a rig with FreeNAS (latest version) to benchmark a SAS 6Gb/s enclosure I just got, an EMC KTN STL3, and the hard drives that came with it: 15 x 600 GB SAS 15k RPM drives of mixed sizes, brands and SAS versions (drive details below). The drives showed up as multipath disks. I threw all 15 drives into a stripe/RAID0 pool and disabled sync and compression. I used dd for the benchmark (commands below), because I don't know how to make jails yet, so I can't use bonnie++. I averaged 1261 MB/s (9.9 Gb/s) for writes and about the same for reads. Does this sound about right?
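(For reference, the equivalent CLI for those two pool settings - a sketch assuming a pool named tank; the pool name is a placeholder:)

Code:
# disable synchronous write semantics and compression for the benchmark
zfs set sync=disabled tank
zfs set compression=off tank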

I ask because, before setting these drives up in FreeNAS, I tested individual drive performance with ATTO Disk Benchmark (Windows): 200 MB/s for the 1 x 3.5" HGST HDD, an average of 222 MB/s for each of the 6 x 2.5" Toshiba HDDs, and an average of 249 MB/s for each of the 8 x 2.5" HGST HDDs (read and write alike). That's 3524 MB/s (27.5 Gb/s) total, and while I understand I won't get exactly that from a stripe/RAID0 pool in FreeNAS, I'd imagine I'd get way more than 1261 MB/s (9.9 Gb/s).

Server Config
Gigabyte GA-X79-UP4 mobo, 1 x XEON E5-2620, 8GB DDR3, LSI 9201-16e

SAS schema
LSI 9201-16e 6Gb/s SAS PCI-e x8 HBA (IT mode) in PCI-e x8 mobo expansion slot <-- 2 x SFF-8088 1 meter cable (for multipath SAS) --> connected to both controllers on the EMC KTN STL3 (the ports labeled with two circles)
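(Since the shelf is dual-pathed, a quick sanity check that both paths actually came up - gmultipath is what FreeNAS uses for this; a sketch:)

Code:
# list each multipath device and the state of its paths
gmultipath status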

Drives
  • 8 x 2.5" HGST SAS 12 Gb/s 15k RPM HDD
  • 6 x 2.5" Toshiba SAS 6Gb/s 15k RPM HDD
  • 1 x 3.5" HGST SAS 6Gb/s 15k RPM HDD
Commands used (to make a 20 GB dummy file)
Code:
# sequential write test: 20 GB of zeroes (compression is off, so zeroes aren't compressed away)
dd if=/dev/zero of=test.dat bs=2048k count=10000
# sequential read test: read the same file back
dd of=/dev/null if=test.dat bs=2048k count=10000
# random-data write test: 20 GB sourced from the kernel RNG
dd if=/dev/random of=test.dat bs=2048k count=10000



Am I getting the best performance here or can I tweak something to get better speeds?


P.S. - For the random tests I always get 89 MB/s (0.7 Gb/s), even in other tests with other types of disks and RAID sets (SSDs, SATA 7.2k RPM HDDs, etc.), so I didn't include those results; I believe something's wrong with how I'm implementing that test.
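(Worth noting: a flat 89 MB/s no matter what the disks are points at the input side - that dd is really measuring how fast /dev/random can produce bytes, not how fast the pool can absorb them. A sketch of a workaround, assuming /tmp is memory-backed and compression stays off - generate the random data once, then stream cached copies of it:)

Code:
# one-time cost: 1 GiB of random bytes (this part is still RNG-bound)
dd if=/dev/random of=/tmp/rand.dat bs=1m count=1024
# stream 20 cached copies (~20 GiB) through dd; the data repeats
# every 1 GiB, which is fine with compression off
sh -c 'for i in $(seq 20); do cat /tmp/rand.dat; done' | dd of=test.dat bs=2048k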
 
Joined
Feb 2, 2016
Messages
574
  • 8 x 2.5" HGST SAS 12 Gb/s 15k RPM HDD
  • 6 x 2.5" Toshiba SAS 6Gb/s 15k RPM HDD
  • 1 x 3.5" HGST SAS 6Gb/s 15k RPM HDD

When you put SAS3 drives in the same stripe as SAS2 drives, you're probably only going to get SAS2 speeds?

It's too late in the day for me to be doing math. Let's see if we can eyeball the bottleneck...

* Aggregated, your drives may do: 27.5 Gbps
* Reported throughput from stripe: 9.9 Gbps
* Intel X79 Express Chipset: PCIe 2.0
* PCI Express 2.0 (per lane): 4 Gbps
* PCIe 2.0 x 8 lanes: 32 Gbps
* SAS2: 6 Gbps
* SFF-8088: 24 Gbps
* SFF-8088 x two cables: 48 Gbps
* XEON E5-2620 bus speed: 7.2 GT/s (32 Gbps, give or take)

Any idea what the backplane capacity is for the enclosure?

The limiting factor looks to be the eight lanes of PCIe 2.0 bus at 32 Gbps. Is the card really in an x8 slot? If it were in an x4 slot, that would cut the bandwidth to 16 Gbps, which is close enough to 10 Gbps to point the finger there. Are there any other PCIe devices fighting for bandwidth? Protocol overhead and theoretical-versus-real-world can knock your numbers down, too.

Personally, I'd be happy with 10 Gbps but I'd be happier with 15 Gbps.

TL;DR: {shrug} I don't know.

Cheers,
Matt
 

blueether

Patron
Joined
Aug 6, 2018
Messages
259
kia ora

I have just got exactly the same shelf, with 15k 600GB disks as well.
Did a few quick tests:
Code:
root@freenas[/mnt/DiskShelf]# zfs set sync=off DiskShelf
cannot set property for 'DiskShelf': 'sync' must be one of 'standard | always | disabled'
root@freenas[/mnt/DiskShelf]# zfs set sync=disabled DiskShelf
root@freenas[/mnt/DiskShelf]# dd if=/dev/zero of=test.dat bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 12.537511 secs (1672702045 bytes/sec)
root@freenas[/mnt/DiskShelf]# dd of=/dev/null if=test.dat bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 4.838074 secs (4334683455 bytes/sec)
root@freenas[/mnt/DiskShelf]# dd if=/dev/random of=test.dat bs=2048k count=10000
^C5931+0 records in
5931+0 records out
12438208512 bytes transferred in 161.856407 secs (76847181 bytes/sec)
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
kia ora

I have just got exactly the same shelf, with 15k 600GB disks as well.
Did a few quick tests:

Thanks for sharing! So you got 1672.70 MB/s (13.38 Gb/s) writes, 4334.68 MB/s (34.68 Gb/s) reads and 76.84 MB/s (0.61 Gb/s) for random.
What are the specs of your server (CPU, RAM, motherboard model, HBA controller model and its PCI-e gen and speed, and which slot it's in)? Are you using multipath? Also, is compression disabled for the dataset?
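(A one-liner that would answer the last question, assuming the dataset is named DiskShelf as in your output:)

Code:
# show the two properties that matter for this benchmark
zfs get sync,compression DiskShelf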
 

blueether

Patron
Joined
Aug 6, 2018
Messages
259
Hi
It will have standard compression; there's no data on it, so I can reconfigure the pool...

I set it up as you had it (stripe / sync disabled / compression off) and got much closer to your results (compression was the big one):
write just over 8 Gb/s
read around 6 Gb/s

This is on a single link from an LSI 4i4e SAS2 HBA,
with dual Xeon x5620s and 16 GB DDR3 RAM, so old hardware.

Sorry for the duff data
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
Hi
It will have standard compression; there's no data on it, so I can reconfigure the pool...

I set it up as you had it (stripe / sync disabled / compression off) and got much closer to your results (compression was the big one):
write just over 8 Gb/s
read around 6 Gb/s

This is on a single link from an LSI 4i4e SAS2 HBA,
with dual Xeon x5620s and 16 GB DDR3 RAM, so old hardware.

Sorry for the duff data

Heh, you actually just saved me some work - the final rig for this FreeNAS build will likely be a XEON x5500/5600 series CPU with 16 GB RAM.
 

blueether

Patron
Joined
Aug 6, 2018
Messages
259
Heh, you actually just saved me some work - the final rig for this FreeNAS build will likely be a XEON x5500/5600 series CPU with 16 GB RAM.
I have two IBM x3650 M3s: one with two X5660s (I think) and 96 GB RAM, and a second for testing with two x5620s and 48 GB RAM (running FreeNAS as a VM or two) - that's where the EMC KTN STL3 is attached.
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
When you put SAS3 drives in the same stripe as SAS2 drives, you're probably only going to get SAS2 speeds?

It's too late in the day for me to be doing math. Let's see if we can eyeball the bottleneck...

* Aggregated, your drives may do: 27.5 Gbps
* Reported throughput from stripe: 9.9 Gbps
* Intel X79 Express Chipset: PCIe 2.0
* PCI Express 2.0 (per lane): 4 Gbps
* PCIe 2.0 x 8 lanes: 32 Gbps
* SAS2: 6 Gbps
* SFF-8088: 24 Gbps
* SFF-8088 x two cables: 48 Gbps
* XEON E5-2620 bus speed: 7.2 GT/s (32 Gbps, give or take)

Any idea what the backplane capacity is for the enclosure?

The limiting factor looks to be the eight lanes of PCIe 2.0 bus at 32 Gbps. Is the card really in an x8 slot? If it were in an x4 slot, that would cut the bandwidth to 16 Gbps, which is close enough to 10 Gbps to point the finger there. Are there any other PCIe devices fighting for bandwidth? Protocol overhead and theoretical-versus-real-world can knock your numbers down, too.

Personally, I'd be happy with 10 Gbps but I'd be happier with 15 Gbps.

TL;DR: {shrug} I don't know.

Cheers,
Matt

Well, the PCIe slots are listed as 3.0 by the manufacturer, but I see Intel lists PCIe slots for the X79 chipset as 2.0... not sure which is true. As for other PCIe devices, there's a PCIe 3.0 x1 video card displaying the FreeNAS console and a 10 GbE NIC that's not doing anything.

I don't know the capacity, but the backplane is fibre channel and the drives connect to it via an interposer board built into the drive caddies.
 

nap

Cadet
iXsystems
Joined
Dec 12, 2017
Messages
9
You can validate the PCIe version and link width using pciconf -lvc on FreeNAS, for example:
Code:
# pciconf -lvc mpr0
mpr0@pci0:3:0:0:        class=0x010700 card=0x30e01000 chip=0x00971000 rev=0x02 hdr=0x00
    vendor     = 'Broadcom / LSI'
    device     = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class      = mass storage
    subclass   = SAS
    cap 01[50] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 10[68] = PCI-Express 2 endpoint max data 256(4096) FLR NS
                 link x8(x8) speed 8.0(8.0)
    cap 05[a8] = MSI supports 1 message, 64 bit, vector masks
    cap 11[c0] = MSI-X supports 96 messages, enabled
                 Table in map 0x14[0xe000], PBA in map 0x14[0xf000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0019[1e0] = PCIe Sec 1 lane errors 0
    ecap 0004[1c0] = Power Budgeting 1
    ecap 0016[190] = DPA 1
    ecap 000e[148] = ARI 1
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
Okay, I got "link x8(x8) speed 5.0(5.0)", which confirms this PCIe 2.0 x8 HBA card is operating at its full potential. But do the specs in the parentheses indicate the max possible spec for the card or for the slot? If the former, how do I find the max possible spec of the slot (if I want to verify the manufacturer's claim)?


Code:
root@freenas[~]# pciconf -lvc mps0
mps0@pci0:3:0:0:        class=0x010700 card=0x30c01000 chip=0x00641000 rev=0x02 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor]'
    class      = mass storage
    subclass   = SAS
    cap 01[50] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 10[68] = PCI-Express 2 endpoint max data 256(4096) FLR NS
                 link x8(x8) speed 5.0(5.0) ASPM disabled(L0s)
    cap 03[d0] = VPD
    cap 05[a8] = MSI supports 1 message, 64 bit
    cap 11[c0] = MSI-X supports 15 messages, enabled
                 Table in map 0x14[0x2000], PBA in map 0x14[0x3800]
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0004[138] = Power Budgeting 1
    ecap 0010[150] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
                     0 VFs configured out of 7 supported
                     First VF RID Offset 0x0001, VF RID Stride 0x0001
                     VF Device ID 0x0064
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    ecap 000e[190] = ARI 1
 

nap

Cadet
iXsystems
Joined
Dec 12, 2017
Messages
9
I forget which value in which place means what at the moment, but one is what the card is capable of and one is what the card managed to negotiate. For example, I have a T580 network card with physical damage that only manages to negotiate x1 even though it should support x8, so it shows up as either x8(x1) or x1(x8) - I forget which order it is, though.
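(A quick way to eyeball every device's negotiated-versus-maximum link in one go - just a grep over the same pciconf output:)

Code:
# device identifier lines plus their negotiated(maximum) link lines
pciconf -lvc | egrep '@pci|link x'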

If you look at the manual for that motherboard (not sure which rev you have), and you're only using x8 cards, you're probably fine with any slot that is electrically x8 or x16. They do nicely provide the block diagram on the Gigabyte site and describe the bandwidth sharing on the specifications page.

Spec page: https://www.gigabyte.com/Motherboard/GA-X79-UP4-rev-10/sp#sp
Manual (see page 8 for block diagram): https://download.gigabyte.com/FileList/Manual/mb_manual_ga-x79-up4_e.pdf
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
I picked up an LSI 9207-8e PCIe 3.0 x8 HBA controller card, confirmed in FreeNAS that it's operating at PCIe 3.0 x8 (link x8(x8) speed 8.0(8.0)), and got essentially the same benchmark results as with the LSI 9201-16e PCIe 2.0 x8 HBA controller card (9.3, 9.3 & 0.7 Gb/s for read, write & random respectively).

Does this rule out the PCIe 2.0 card as the culprit? I just brought home a Dell PowerVault MD 1220 disk shelf... should I bother throwing the SAS drives in that and benchmarking again, or is 9 to 10 Gbps the best I can get regardless of the enclosure?
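(One more variable that might be worth ruling out before blaming the enclosure: a single dd is one process pushing one stream, so it can top out on CPU before the disks do. Running several writers in parallel and summing the reported rates would show whether the pool has more headroom. A sketch - the mount point and counts are made up:)

Code:
# run under sh: four parallel 5 GB writers; add up the four bytes/sec figures
for i in 1 2 3 4; do
  dd if=/dev/zero of=/mnt/pool/test$i.dat bs=2048k count=2500 &
done
wait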


P.S. - the final FreeNAS build will likely be a XEON E3-1240 v2 based server mobo with 24 GB RAM, if that matters (it has 2 PCIe 3.0 x8 slots).
 