BUILD All Flash (SSD) Hardware Recommendations

Status
Not open for further replies.
Joined
Mar 22, 2016
Messages
217
Hey all! Let me just say thank you in advance for any and all help received!

I'm contemplating building an all-flash FreeNAS box. I have read the threads about SSD-based NAS builds and saw the recommendations to go with an iXsystems unit, but this is not something mission critical; it's more experimental than anything. It will host 6 iSCSI targets and our current movie projects (which will be replicated on another FreeNAS system). With that said...

Motherboard: X10SRH-CNL4F
CPU: Xeon E5-1630V4
Case: SUPERMICRO 846E16-R1200B
Case: SuperChassis 216BE1C-R741JBOD (external shelf)
HBA: LSI-9207-8e
SSD: Samsung 850 EVO 500GB
RAM: 64GB Hynix HMA84GR7MFR4N-TF ECC RDIMM
SLOG: Intel 750 SSD (Looks like it is still needed)
BOOT: Mirrored SanDisk Cruzer
NIC: Chelsio T580

So here are the questions:

  • RAM: how much do I really need? There will be at most 24 SSDs in this build with at most 12TB of storage space, but that won't happen for a while.
  • Pool configuration. I was initially going to go with a mirrored vdev config, but since that would really kill space I was thinking about a RAIDZ1 config: either 4 vdevs with 6 drives each, or 6 vdevs with 4 drives each. Throughput is king here. We work with large video files (sometimes in the TB range) and being able to move them quickly is key. IOPS are still a consideration since there will be some iSCSI targets, hence the multi-vdev configs.
  • HBA: I currently use a couple of LSI-9207s, which have served me well in 3 FreeNAS builds. Each connection has 4x6Gb/s of bandwidth. In an effort to maximize throughput, would it be possible to use something like an LSI 9300-8e? I understand that my backplane has a single SAS2 connection and all of my SSDs are SATA3, both of which cap out at 6Gb/s, but would a SAS3 HBA allow more aggregate bandwidth since it has 4x12Gb/s connections? Or will this not matter because of the backplane, the SSDs, the cable, or just because it doesn't work that way? (Rough math sketched below.)
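Here's the back-of-the-envelope math I'm working from; the ~500 MB/s per-SSD figure and the 8b/10b encoding overhead are my assumptions:

```python
# Back-of-the-envelope numbers for the uplink question. Assumptions: 4 lanes
# per wide SAS connection, 8b/10b line encoding, ~500 MB/s sequential per
# SATA SSD. Real-world figures will be lower.

LANES = 4

def uplink_gb_per_s(lane_gbit_s):
    """Approximate usable GB/s of one 4-lane SAS connection."""
    return lane_gbit_s * LANES * (8 / 10) / 8   # Gbit/s -> GB/s after encoding

sas2_uplink = uplink_gb_per_s(6)     # SAS2: 6 Gb/s per lane
sas3_uplink = uplink_gb_per_s(12)    # SAS3: 12 Gb/s per lane (if negotiable)

ssd_count = 24
per_ssd_gb_s = 0.5                   # assumed per-SSD sequential rate
aggregate_demand = ssd_count * per_ssd_gb_s

print(f"SAS2 x4 uplink : ~{sas2_uplink:.1f} GB/s usable")
print(f"SAS3 x4 uplink : ~{sas3_uplink:.1f} GB/s usable")
print(f"24 SSDs        : ~{aggregate_demand:.1f} GB/s aggregate")
```

Either way the drives could supply far more than a single x4 uplink carries; what I can't tell is whether the uplink can run above SAS2 speed at all with this backplane.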
Thank you again!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
RAM: how much do I really need? There will be at most 24 SSDs in this build with at most 12TB of storage space, but that won't happen for a while.
64GB seems like a decent starting point.

Pool configuration. I was initially going to go with a mirrored vdev config, but since that would really kill space I was thinking about a RAIDZ1 config: either 4 vdevs with 6 drives each, or 6 vdevs with 4 drives each. Throughput is king here. We work with large video files (sometimes in the TB range) and being able to move them quickly is key. IOPS are still a consideration since there will be some iSCSI targets, hence the multi-vdev configs.
Be sure to run real workloads with your proposed configuration, to get a feel for the performance you get. Count on IOPS dropping as the pool fills, so test that, too.
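If you want to script that sort of check, a crude sketch like the one below shows the trend; the dataset path, sizes, and sync 4 KiB write pattern are all placeholders, and a dedicated tool like fio will give far better numbers:

```python
# Minimal sketch: fill a test dataset in stages and measure sync 4 KiB
# random-write rate at each fill level. Crude stand-in for a real benchmark;
# the mountpoint and sizes below are placeholders.
import os, random, time

POOL_MOUNT = "/mnt/tank/benchmark"      # hypothetical test dataset
FILL_STEP_BYTES = 10 * 1024**3          # grow the fill file 10 GiB per step
IOPS_TEST_OPS = 5_000                   # sync 4 KiB writes per measurement

def fill(path, nbytes):
    """Append incompressible data to simulate the pool filling up."""
    with open(path, "ab") as f:
        for _ in range(nbytes // (1024 * 1024)):
            f.write(os.urandom(1024 * 1024))
        f.flush()
        os.fsync(f.fileno())

def measure_sync_write_iops(path, file_bytes=1024**3, ops=IOPS_TEST_OPS):
    """Time sync 4 KiB random writes into a fixed-size file, return ops/sec."""
    with open(path, "wb") as f:
        f.truncate(file_bytes)          # sparse 1 GiB target file
    block = os.urandom(4096)
    start = time.time()
    with open(path, "r+b") as f:
        for _ in range(ops):
            f.seek(random.randrange(0, file_bytes - 4096))
            f.write(block)
            f.flush()
            os.fsync(f.fileno())        # force each write to stable storage
    return ops / (time.time() - start)

if __name__ == "__main__":
    os.makedirs(POOL_MOUNT, exist_ok=True)
    fill_file = os.path.join(POOL_MOUNT, "filler.bin")
    test_file = os.path.join(POOL_MOUNT, "iops-test.bin")
    for step in range(1, 6):            # grow the fill file in 10 GiB steps
        fill(fill_file, FILL_STEP_BYTES)
        iops = measure_sync_write_iops(test_file)
        print(f"after ~{step * 10} GiB filled: {iops:,.0f} sync write IOPS")
```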

HBA: I currently use a couple of LSI-9207s
Sure.

In an effort to maximize throughput, would it be possible to use something like an LSI 9300-8e? I understand that my backplane has a single SAS2 connection and all of my SSDs are SATA3, both of which cap out at 6Gb/s, but would a SAS3 HBA allow more aggregate bandwidth since it has 4x12Gb/s connections? Or will this not matter because of the backplane, the SSDs, the cable, or just because it doesn't work that way?
It would possibly help if you had an SAS3 backplane. With an SAS2 backplane, you're limited to 6Gb/s per lane.
 

diehard

Contributor
Joined
Mar 21, 2013
Messages
162
Firstly, I don't know if this setup is optimal for your scenario. If throughput is what matters and you are worried about space, traditional spinning rust is still the way to go. The connection method being iSCSI doesn't necessarily change your IOPS needs (unless these are VMs, but you didn't really specify where the other end of the iSCSI connection is going). A traditional NL-SAS array could get you what you need for less money.

You simply won't find many people who can contribute a lot of information about all-flash home-built ZFS arrays; not many people are doing them. There have been reports (take them with some salt) that ZFS thrashes SSDs with metadata updates and fragmentation, which could severely affect their lifespan, and since you are using TLC-based drives that does not bode well.

As these will be mostly video files, I don't believe a large ARC or L2ARC will help much, so 64GB will probably be fine. As far as your HBA goes, I don't believe going with a 9300 will help overall speed at all with those backplanes, not to mention how much more tried and true the 9207 chipsets are.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
This looks interesting, but I wonder if just having a quality SLOG and L2ARC would give sufficient performance?
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
with at most 12TB of storage space
If this is for iSCSI, then you need to plan on not using more than 50% (6TB) of the space. Probably better to stay below 40%...
Pool configuration. I was initially going to go with a mirrored vdev config, but since that would really kill space I was thinking about a RAIDZ1 config: either 4 vdevs with 6 drives each, or 6 vdevs with 4 drives each. Throughput is king here. We work with large video files (sometimes in the TB range) and being able to move them quickly is key. IOPS are still a consideration since there will be some iSCSI targets, hence the multi-vdev configs.
From an iSCSI perspective, mirrors are the only way to go IMHO. If IOPS are a concern, the real bang for the buck is mirrors (since you get more vdevs), but as you noted, it's a space killer.
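To put rough numbers on the layouts being discussed (24 x 500 GB drives; ZFS metadata and padding overhead ignored, so real usable space will be a bit lower):

```python
# Usable capacity for the layouts discussed, with the 50% iSCSI rule of thumb.
# 24 x 500 GB drives; ZFS metadata/padding overhead ignored.

DRIVE_TB = 0.5
layouts = {
    "12 x 2-way mirrors": {"vdevs": 12, "data_drives_per_vdev": 1},
    "4 x RAIDZ1 of 6":    {"vdevs": 4,  "data_drives_per_vdev": 5},
    "6 x RAIDZ1 of 4":    {"vdevs": 6,  "data_drives_per_vdev": 3},
}

for name, cfg in layouts.items():
    usable = cfg["vdevs"] * cfg["data_drives_per_vdev"] * DRIVE_TB
    print(f"{name:20s} vdevs={cfg['vdevs']:2d}  "
          f"usable ~{usable:4.1f} TB  iSCSI comfort zone ~{usable * 0.5:4.1f} TB")
```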
 
Joined
Feb 2, 2016
Messages
574
Throughput is king here. We work with large video files
For raw throughput and video editing, I'd dump RAIDZ1 in favor of striped mirrors. Six effective spindles with RAIDZ1 versus 12 effective spindles with mirrors is no contest. Same level of redundancy, too.

The Samsung 500 GB drives are going to cost $3,800. If this is just in-transit work space that is regularly replicated, I'd go with larger, less-expensive SSDs. You can get 18 960 GB OCZ or ADATA SSDs for nearly the same price as 24 500 GB drives, and you'll end up with nearly 50% more disk space even with fewer spindles.

With larger drives, you'll also have room to add capacity (and throughput) which you may need to keep your iSCSI utilization below the recommended 50% point.
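Here's a quick sketch of that trade-off, assuming striped two-way mirrors in both cases (capacities only; overhead and prices aren't modeled):

```python
# 24 x 500 GB vs. 18 x 960 GB, both as striped two-way mirrors.
# Capacities only; ZFS overhead ignored and prices not modeled.

configs = {"24 x 500 GB": (24, 0.5), "18 x 960 GB": (18, 0.96)}

for name, (drives, size_tb) in configs.items():
    vdevs = drives // 2              # two-way mirrors
    usable = vdevs * size_tb         # one drive's worth of data per mirror
    print(f"{name}: {vdevs} mirror vdevs, ~{usable:.1f} TB usable, "
          f"~{usable * 0.5:.1f} TB at the 50% mark")
```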

While I don't trust either OCZ or ADATA as much as Samsung or Intel, we do use the ADATA drives in our FreeNAS production server and have for about a year. They host our XenServer VMs. When they burn out, we'll replace them with whatever SSD is big, cheap and fast enough.

I'm no video expert, but I'm inclined to agree with @diehard that spinning rust may be fast enough. Twelve mirrors in a stripe is pretty darn fast. One-terabyte 10k drives are cheap, too.

Cheers,
Matt
 
Joined
Mar 22, 2016
Messages
217
Thanks for the information all!

Noted about the backplane; it looks like I'll be limited to 24Gb/s of bandwidth, which might just be the nail in the coffin for this project, at least with the hardware I have on hand.

As far as the SSDs go, I decided on the Samsungs since I already have 8 of them, though I did look around. I think I'll also have to look into how hard ZFS is on all-flash systems.

After the advice, I think I'll forget the RAIDZ config and go with mirrored vdevs instead. Looks like there's no way around the performance benefits of mirrored vdevs. That would give me 6TB of space, but I would need to keep it at around 3TB to stay under 50%. That's roughly the size of 3 of our large projects, which should be just perfect, for now at least.

Ultimately the goal is to cut down on transfer times, and to do this we are going to try 40GbE. The hope is to saturate that link to the best of our abilities. Trying to accomplish this with spinning drives would be a challenge, mostly because of the type of files being moved; ARC and L2ARC wouldn't seem to help much in this instance, so it would take a lot of spinning drives. We figured it might actually be cheaper to try the SSD route than the HDD route: to get the performance we wanted would have taken 60+ HDDs in a mirrored vdev config, which is an ungodly amount of storage space but also a lot of racks.
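Roughly, the drive-count math behind that looks like the sketch below; the per-drive rates are assumptions, and real-world throughput drops as pools fill and fragment, which is how the estimate climbs past 60 spinners:

```python
# Rough sketch: drives needed in striped mirrors to keep a 40GbE link busy
# with large sequential transfers. Per-drive rates are assumptions
# (~500 MB/s per SATA SSD, ~180 MB/s per 7200rpm HDD), not measurements.
import math

LINK_GB_S = 40 / 8 * 0.95        # ~4.75 GB/s usable on a 40GbE link (rough)

def drives_needed(per_drive_mb_s, write=False):
    """Drives required in striped mirrors to supply LINK_GB_S.

    Reads can be served from either side of a mirror, so every drive
    contributes; writes hit both sides, so only half the drives add
    unique bandwidth.
    """
    effective_gb_s = per_drive_mb_s / 1000
    if write:
        effective_gb_s /= 2
    return math.ceil(LINK_GB_S / effective_gb_s)

for label, rate in [("SATA SSD ~500 MB/s", 500), ("7200rpm HDD ~180 MB/s", 180)]:
    print(f"{label}: ~{drives_needed(rate)} drives to fill the link on reads, "
          f"~{drives_needed(rate, write=True)} on writes")
```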

We currently use the system in my sig, which gets pretty close to saturating a 10Gb link. That's still miles ahead of where we were previously on our 1Gb link.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I'm not 100% convinced that the 50% thing necessarily applies to an all-flash array.

My understanding is that it's because, as an array fills, it gets fragmented, but SSDs deal with fragmentation much better since seeking is essentially free.
 
Joined
Feb 2, 2016
Messages
574
Samsung recommends 10% SSD over provisioning (above the internal over provisioning) on individual drives for acceptable performance and longevity. That means FreeNAS should use no more than 90% of an SSD even before FreeNAS performance optimization is factored into the equation.

Seagate says SSD performance falls at 50% capacity.

Even a full SSD is going to perform faster than an empty mechanical drive so I don't spend too much time worrying about SSD performance under high utilization for our use cases. But, if I did, I'd keep my SSDs and my pools less than half full.

Cheers,
Matt
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm not 100% convinced that the 50% thing necessarily applies to an all-flash array.

My understanding is that it's because, as an array fills, it gets fragmented, but SSDs deal with fragmentation much better since seeking is essentially free.
The effect is going to be much smaller than with spinning rust, but SSDs also suffer. I'd like to see numbers for a variety of SSDs (consumer MLC, enterprise MLC, TLC, ...), since ZFS can be rather tricky to predict.
 
Joined
Mar 22, 2016
Messages
217
Hmm that's interesting.

I may embark on this quest simply to experiment, see what happens, and let everyone know. Outside of iXsystems' TrueNAS all-flash systems, there seems to be very limited information about FreeNAS with all SSDs. I may have to expand my search to include ZFS in general and see what else comes up.

Looks like I have a lot of learning ahead of me though.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Samsung recommends 10% SSD over provisioning (above the internal over provisioning) on individual drives for acceptable performance and longevity. That means FreeNAS should use no more than 90% of an SSD even before FreeNAS performance optimization is factored into the equation.

Seagate says SSD performance falls at 50% capacity.

Even a full SSD is going to perform faster than an empty mechanical drive so I don't spend too much time worrying about SSD performance under high utilization for our use cases. But, if I did, I'd keep my SSDs and my pools less than half full.

Cheers,
Matt

That is a good link, but it also serves to help prove my point: although performance does decrease at 50%, the decrease accelerates the fuller the drive gets.

It's exactly like ZFS, and for roughly the same reasons; I guess you could say SSDs are copy-on-write.

And ZFS assists the SSD by coalescing writes and never re-writing a block in place.

Anyway, my experience is that 30% OP results in very little performance degradation, and with the ~8% factory OP plus 20% minimum free space on a NAS you are sitting bang on the 28% minimum OP used by enterprise SSD manufacturers. Those enterprise SSDs are designed to run at 100% full after OP, for use in flash arrays or as caching drives ;)

Free space in your array counts as OP, so 50% is overkill. The 20% limit is there to keep ZFS from switching to its pathological block-finding algorithm (basically, past a certain percentage of capacity utilization, ZFS tries harder to avoid fragmentation by searching harder for a block rather than taking the first one it finds). Once you get to 80% full it's time to plan an expansion anyway.
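As a rough sketch of that arithmetic (OP percentage conventions vary between spare-to-user and spare-to-raw capacity, so these are ballpark figures):

```python
# Rough over-provisioning arithmetic from the post above. OP percentage
# conventions vary (spare-to-user vs. spare-to-raw), so these are ballpark.

FACTORY_OP = 0.08                    # ~8% built-in spare area (approximate)

def effective_op(pool_free_fraction, factory_op=FACTORY_OP):
    """Simple additive estimate: unwritten pool space acts like extra spare."""
    return factory_op + pool_free_fraction

for free in (0.20, 0.30, 0.50):
    print(f"{free:.0%} pool free space -> ~{effective_op(free):.0%} effective OP")
```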

AnandTech has very good SSD reviews; they go into worst-case steady-state random I/O testing at various levels of OP.

Add StorageReview and you pretty much cover everything.
 
Joined
Mar 22, 2016
Messages
217
Well, I've expanded the search to ZFS in general and there really is not a lot out there about it.

@jgreco had some really good posts in a thread recently about it. https://forums.freenas.org/index.php?threads/ssd-only-build-where-are-we-at.42427/page-2

What I got from it is that, depending on how many writes there are to the system, odds are you'll be upgrading the SSDs for more space before you start replacing them due to wear. There was a lot of discussion about ZFS possibly thrashing SSDs, but no one could say for sure because it isn't done a lot.

Here's to experimenting!
 