Hitachi SAS SSDs as SLOG and other SSD questions

Status
Not open for further replies.

Sirius

Dabbler
Joined
Mar 1, 2018
Messages
41
I have a ZFS server with 2 x mirrored/striped (aka RAID10) pools.

One pool is 8 x 3TB 7200rpm Toshiba drives and holds an iSCSI volume. The other pool is 12 x 4TB drives and is used for SMB shares. (I put a zpool status dump at the bottom of the post)

Currently I'm using 2 x HUSSL4010BSS600 as my SLOG devices. Each drive has 2 x 10GB partitions, and each partition is assigned to one of the pools. As a result, each pool gets a mirrored SLOG built from a 10GB partition on each SSD.
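In case it's useful to anyone reading later, the layout boils down to something like this (a sketch rather than the exact commands I ran; da20/da21 are the two SSDs):

Code:
# two 10GB partitions on each SSD
gpart create -s gpt da20
gpart create -s gpt da21
gpart add -t freebsd-zfs -s 10G da20    # creates da20p1
gpart add -t freebsd-zfs -s 10G da20    # creates da20p2
gpart add -t freebsd-zfs -s 10G da21    # creates da21p1
gpart add -t freebsd-zfs -s 10G da21    # creates da21p2
# one mirrored SLOG per pool, built from one partition on each SSD
zpool add mainzpool log mirror da20p1 da21p1
zpool add iscsizpool log mirror da20p2 da21p2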

I'm planning to upgrade to 10gigE asap, and my concern is that the HGST drives will be a bottleneck for sync writes. I have a single Dell PERC H200 controller (flashed to IT mode) as the only HBA for all the drives.

So my initial questions are as follows...
1. Would the 2 x HGST drives bottleneck on a 10gigE network?
2. Would adding an additional 2 x HGST drives and maybe putting them on a separate HBA help?
3. Assuming having 4 SLOG drives won't help, what's the next step? I'm considering using the Optane 800p. This is a home system so the write loads aren't too intense so the lifespan should be decent.

One possible option I'm considering is using a QNAP QM2-4P-384 (they work fine in a regular PC) and loading it up with 4 x Optane 800p 58GB drives. This would allow me to mix and match the Optanes as SLOG or L2ARC to my heart's content while only using a single PCIe slot. Also, as far as I can tell it works out cheaper than any other option (I think...)
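My understanding is that the mix-and-match part is just a matter of which vdev type each drive (or partition) gets added as - a sketch, with nvd0-nvd3 standing in for however the Optanes would enumerate:

Code:
zpool add mainzpool log mirror nvd0 nvd1    # two Optanes as a mirrored SLOG
zpool add iscsizpool cache nvd2             # another as L2ARC for the iSCSI pool
zpool remove iscsizpool nvd2                # cache (and log) vdevs can be removed later if I change my mind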

Anyway... I'd appreciate any advice people can provide! Thank you :)

Here's the zpool status of my current pools:
Code:
  pool: iscsizpool
 state: ONLINE
  scan: scrub repaired 0 in 0 days 02:43:28 with 0 errors on Sun Jul 22 02:43:28 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        iscsizpool                                      ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/a1be2608-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
            gptid/a4bb11af-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/a7b5889c-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
            gptid/aa99e2dc-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/ad789ffc-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
            gptid/b07707c9-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
          mirror-3                                      ONLINE       0     0     0
            gptid/b385731d-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
            gptid/b6834739-6d6d-11e8-88ef-bc5ff457ff46  ONLINE       0     0     0
        logs
          mirror-4                                      ONLINE       0     0     0
            da20p2                                      ONLINE       0     0     0
            da21p2                                      ONLINE       0     0     0

errors: No known data errors

  pool: mainzpool
 state: ONLINE
  scan: scrub repaired 0 in 0 days 13:04:52 with 0 errors on Sun Aug 12 13:04:53 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        mainzpool                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/bfac677c-0366-11e8-9403-d850e6c2dc19  ONLINE       0     0     0
            gptid/c0c26f20-0366-11e8-9403-d850e6c2dc19  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/64b27c2c-0388-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
            gptid/65f3d003-0388-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/2c8df379-0389-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
            gptid/2d659037-0389-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
          mirror-3                                      ONLINE       0     0     0
            gptid/4d203a23-0389-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
            gptid/4e21bbb5-0389-11e8-9c5b-d850e6c2dc19  ONLINE       0     0     0
          mirror-4                                      ONLINE       0     0     0
            gptid/24be0f82-1d31-11e8-b705-d850e6c2dc19  ONLINE       0     0     0
            gptid/25df158d-1d31-11e8-b705-d850e6c2dc19  ONLINE       0     0     0
          mirror-5                                      ONLINE       0     0     0
            gptid/87c25849-1e6a-11e8-81c2-d850e6c2dc19  ONLINE       0     0     0
            gptid/89cd92c0-1e6a-11e8-81c2-d850e6c2dc19  ONLINE       0     0     0
        logs
          mirror-6                                      ONLINE       0     0     0
            da20p1                                      ONLINE       0     0     0
            da21p1                                      ONLINE       0     0     0

errors: No known data errors
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
A whole thread about testing various devices for use as SLOG devices and a graph:

That graph specifically compares results from a few users running Optane 900p drives as SLOG devices. Here's a direct link to the post where Chris Moore benchmarks the exact HGST SLC SSD you're using:

https://forums.freenas.org/index.ph...-and-finding-the-best-slog.63521/#post-455075

So for your questions:

1. Would the 2 x HGST drives bottleneck on a 10gigE network?
2. Would adding an additional 2 x HGST drives and maybe putting them on a separate HBA help?
3. Assuming having 4 SLOG drives won't help, what's the next step? I'm considering using the Optane 800p. This is a home system so the write loads aren't too intense so the lifespan should be decent.

1. Yes, absolutely. At small block sizes (16K and below) they can't even saturate 1GbE.
2. Adding another mirror of them would get you past that threshold at 16K, but nowhere near 10GbE. The 8 x 6Gbps links to your expander backplane shouldn't be a bottleneck, but an additional HBA might be worth considering if you plan to add a lot more SSDs.
3. You could buy faster SATA/SAS devices, but for 10GbE performance you really want to get off that bus and onto PCIe, so the M.2 NVMe adapter card with Optane is your current front-runner. Optane is a lot like the old "direct-write-to-NAND" approach of the Pliant Lightning SSDs, so it should be safe - the DC P4800X is listed as having PLP, but obviously at enormous added cost. Edit: See the footnote and Intel's lame-duck response; given their ARK ECC mix-up years ago, I'm still wary of their claims.
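To put rough numbers behind 1 and 3 (back-of-the-envelope only, not benchmark figures):

Code:
 1GbE payload ceiling:  1 Gbit/s / 8              ~  125 MB/s
10GbE payload ceiling: 10 Gbit/s / 8              ~ 1250 MB/s (closer to ~1.1 GB/s after protocol overhead)
SAS 6Gbps link:         6 Gbit/s * 0.8 (8b/10b)   ~  600 MB/s absolute best case per device

Sync writes at 4K-16K sit well below that per-device ceiling anyway, since every write has to be committed to stable media before it's acknowledged.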

Counter-question: does your SMB workload really require sync writes? Sharing those SLOG devices between pools could be causing unnecessary contention. If it doesn't, consider removing the SLOG from that pool.
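Checking (and, if it turns out you don't need it, relaxing) the sync setting is a one-liner; the pool name below is just taken from your zpool status:

Code:
zfs get sync mainzpool           # standard = honor application sync requests, always = force, disabled = never
zfs set sync=standard mainzpool  # only if you've been forcing sync=always and decide SMB doesn't need it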

* Optane by design is DRAM-less so it shouldn't need power-loss-protection, even down to the 16GB "Optane Memory" products. That said, Intel's claimed differentiation is that the DC P4800X has "circuit checks on the power loss system" - I have no idea what that actually means, and in all likelihood the Optane 900p is perfectly fit for purpose. (The DC P4800X also has 4X the total TBW rating.)
 
Last edited:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I actually have four of the HGST SAS SSDs that I tested in my system as part of my iSCSI pool, and I do notice the drop in performance on small-block writes. I have not installed it yet, but I already have an Intel SSD DC P3700, and I am planning to install it in my system to replace the HGST drives.

Joined
Dec 29, 2014
Messages
1,135
That is ~$300 more than I paid for my Optane 900P. I am just curious what you think the advantage of this one is over the 900P.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
That is ~$300 more than I paid for my Optane 900P. I am just curious what you think the advantage of this one is over the 900P.

I'm guessing he didn't pay anywhere near the MSRP for it.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
That is ~$300 more than I paid for my Optane 900P. I am just curious what you think the advantage of this one is over the 900P.
It is supposed to have better support for power loss, but the real reason I have it is that I got a seriously good deal on it. If I understood the benchmark results correctly, the Optane drive is faster, so it all comes down to price in this situation.
 

Sirius

Dabbler
Joined
Mar 1, 2018
Messages
41
Thank you for answering a lot of my questions, everyone. I now have some new thoughts/concerns.

Unfortunately I live in Australia, so second-hand enterprise gear is either a pain to find or the shipping costs a fortune. A single Optane 900P 280GB costs AUD $519 here. A single device also gives me no redundancy, whereas multiple 800P 58GB drives would. And since it's my understanding that a single device shouldn't be shared as SLOG and/or L2ARC across multiple pools, a 900P 280GB serving as the SLOG for just one pool would be a waste in my situation, IMHO.

As to whether sync writes are required: for one pool, yes (the 12-drive SMB pool), as I value maximum integrity there (hence my planned move to ECC and a Xeon in the future). For the other pool... probably not. If it were corrupted I could re-download/re-install everything, but that would be extremely painful on my 6Mbit/0.5Mbit ADSL2.

In terms of backups, I have an LTO-4 tape drive (which I'll probably upgrade to LTO-5 if I can find a cheap one) and a BDXL burner, but restoring from those is not a fun experience either.
 
Last edited:

Sirius

Dabbler
Joined
Mar 1, 2018
Messages
41
One other idea I've been considering, though it has weaknesses (i.e. latency), is using a hardware RAID card with 2 or 4 of the HGST SSDs behind it in hardware RAID. I've noticed LSI hardware RAID cards with a BBU and 1GB of cache aren't too bad price-wise.

Would such a setup be able to handle 10gigE or would it still be too slow?
 
Joined
Dec 29, 2014
Messages
1,135
One other idea I've been considering, though it has weaknesses (i.e. latency), is using a hardware RAID card with 2 or 4 of the HGST SSDs behind it in hardware RAID. I've noticed LSI hardware RAID cards with a BBU and 1GB of cache aren't too bad price-wise.

Hardware RAID cards and FreeNAS are not a good combination. They don't let ZFS do the job it was designed for, and they really aren't supported. I just migrated away from using a 9271 in JBOD mode to a 9207. Everything mostly worked, but SMART discovery didn't work on the JBOD drives, so they weren't being polled. At least in the US, 9207s can be had for ~$50 on eBay, which is cheaper than almost any of the decent RAID cards.
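For what it's worth, once the drives hang off the IT-mode 9207 they show up as plain da devices, and the usual smartctl calls work directly (the device name here is just an example):

Code:
smartctl -a /dev/da0            # full SMART/health report
smartctl -t short /dev/da0      # kick off a short self-test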
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Would such a setup be able to handle 10gigE or would it still be too slow?

10GbE would blow out the 1GB BBWC in a second and then you'd be choked by the speed of the SAS drives again, with the added downside of having hardware RAID in play. While I've been meaning to experiment with this on a 1GbE setup (abusing BBWC cache) it's not a good match for 10GbE.

Shame about the second-hand cost, but I understand how it is with shipping things to AUS. And with that ADSL speed (and likely bandwidth caps?) I'd wager I could probably hop a flight with a few HDDs of data in my carry-on, and deliver your data both cheaper and faster. ;)

I'd go with the PCIe NVMe adapter and a pair of Optane 800p 58GB cards to start with. Split and partition them as you did the HGST SSDs. They should be able to handle 300MB/s or so of sync-writes at the lower block sizes (4K) scaling up to about double that at 128K - of course, the shared load between pools means that you may have overhead reducing that.
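The swap itself would look something like this - a sketch only, since nvd0/nvd1 are my guess at how the Optanes will enumerate and the exact partition layout is up to you:

Code:
# log vdevs can be removed live, using the names shown in zpool status
zpool remove iscsizpool mirror-4
zpool remove mainzpool mirror-6
# partition the Optanes as you did the HGST drives, then add the partitions back as mirrored SLOGs
zpool add mainzpool log mirror nvd0p1 nvd1p1
zpool add iscsizpool log mirror nvd0p2 nvd1p2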

Oh, and when you get them, but before you add them as SLOG, do us a favor and run the benchmark from the SLOG thread we've linked above. We'd appreciate the data points. :)
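If I remember right, the test in that thread is just the sync-write mode of diskinfo run against the bare device - it's destructive, so do it before the drive holds anything:

Code:
diskinfo -wS /dev/nvd0    # -w permits the destructive write test, -S measures sync-write performance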
 

Sirius

Dabbler
Joined
Mar 1, 2018
Messages
41
10GbE would blow out the 1GB BBWC in a second and then you'd be choked by the speed of the SAS drives again, with the added downside of having hardware RAID in play. While I've been meaning to experiment with this on a 1GbE setup (abusing BBWC cache) it's not a good match for 10GbE.

Shame about the second-hand cost, but I understand how it is with shipping things to AUS. And with that ADSL speed (and likely bandwidth caps?) I'd wager I could probably hop a flight with a few HDDs of data in my carry-on, and deliver your data both cheaper and faster. ;)

I'd go with the PCIe NVMe adapter and a pair of Optane 800p 58GB cards to start with. Split and partition them as you did the HGST SSDs. They should be able to handle 300MB/s or so of sync-writes at the lower block sizes (4K) scaling up to about double that at 128K - of course, the shared load between pools means that you may have overhead reducing that.

Oh, and when you get them, but before you add them as SLOG, do us a favor and run the benchmark from the SLOG thread we've linked above. We'd appreciate the data points. :)

Thankfully we don't have data caps, although at one point our ADSL1 (before we 'upgraded' to ADSL2) was capped at 3GB a month (!), so... I need to look on the sorta-bright side, I guess???

That's a good point about just getting two Optane drives to start with... the advantage of going with the M.2 carrier is that I keep that option open, and if something better comes out (enterprise M.2 NVMe with PLP? newer Optanes?) I can just swap them in.

The next problem is finding non-counterfeit 10gig Ethernet adapters. I was going to place an order with Natex, but they seem to have sold out. The non-dodgy-looking eBay options are ~AUD $150 each :( It's annoying, as I have a decent (but noisy) HP ProCurve enterprise switch just sitting here until I get the 10gig cards.

And no worries with putting the scores in the thread, I shall do that!

Hardware RAID cards and FreeNAS are not a good combination. They don't let ZFS do the job it was designed for, and they really aren't supported. I just migrated away from using a 9271 in JBOD mode to a 9207. Everything mostly worked, but SMART discovery didn't work on the JBOD drives, so they weren't being polled. At least in the US, 9207s can be had for ~$50 on eBay, which is cheaper than almost any of the decent RAID cards.

I didn't really mean putting data drives behind the controller; I meant using a hardware RAID card just for the SLOG, to abuse the card's cache. Although, as HoneyBadger has pointed out, 10GbE would rapidly fill the 1GB cache, so it's a pointless idea anyway.

My current HBA is a PERC H200 flashed to the generic LSI firmware; it seems to be working well so far.


So... thanks everyone, I really appreciate it! It looks like a QNAP QM2-4P-384 and 2 x Optane 800p 58GB are on my 'to buy' list for next month (hopefully).
 