What is needed to achieve 10Gbps read with RAIDZ-2 (raid6)

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
The Evo drives don't hold up as well. If you are going to do this, you should spend the extra money and go with the Pro drives.
Think of it as an investment.
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
I have been searching for parts availability and pricing, and have arrived at the following build. I have managed to find space to fit a 2U, and I am concerned that the fans in a 1U chassis would be too loud, so I am going with 2U now.

Supermicro chassis (used, at around $58 USD, although I don't have the part number)
Supermicro X11SAE (apparently the LGA 1151 boards are rated up to 50 degrees Celsius vs 35 degrees Celsius for most of their boards)
- Xeon E3-1235L v5 (25W TDP so helps to keep things cool)
- 16GB DDR4 ECC 2400 x 4 = 64GB total (the maximum memory supported by the Skylake processors)
- LSI 9300-8i HBA
- Samsung SSD 860 QVO 1TB x 12 = 10TB data in RAIDZ2
- Mellanox MNPA19-XTR 10Gbps card

I know it has been suggested to use the 860 PRO disks because of their durability, but this is really just for me to store my family photos and some personal content, so I doubt I'll exceed the rated 360 TBW limit of these drives. And with RAIDZ2 I should have sufficient redundancy and time to get replacement drives.

Any comments? Do I need to do anything with the LSI 9300-8i HBA, such as flashing a specific firmware?
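In case it matters, here is how I'd check what firmware the card is currently running (a rough sketch only, assuming Avago's sas3flash utility is available on the system):

```python
# Rough sketch: dump firmware info for LSI/Avago SAS3 HBAs (e.g. 9300-8i).
# Assumes Avago's `sas3flash` utility is installed and on the PATH.
import subprocess

def hba_report() -> str:
    """Return the raw output of `sas3flash -list` for all SAS3 controllers."""
    return subprocess.run(
        ["sas3flash", "-list"],
        capture_output=True, text=True, check=True,
    ).stdout

if __name__ == "__main__":
    for line in hba_report().splitlines():
        # Print the firmware-related lines; the version can then be compared
        # against what the FreeNAS driver expects.
        if "Firmware" in line:
            print(line.strip())
```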
 
Last edited:

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
No. This is not what you want. It has an audio chipset and no IPMI. If you are going to buy a Supermicro board, you want an actual server board.
Take a look at this guide created by one of the other moderators and please choose a different board:
https://www.ixsystems.com/community/resources/so-you’ve-decided-to-buy-a-supermicro-x11-xeon-e-coffee-lake-board.107/
Yeah, I was thinking about how these motherboards are awfully similar to desktop PCs rather than servers. Anyway, I found a really awesome deal and placed an order for the following.

DL360 G8
Xeon E5-2630L v2 [60W TDP]
16GB DDR3 ECC REG 1600 x 8 [128GB total]
Mellanox MNPA19-XTR [1 x 10Gbps]
LSI 9300-8i HBA

I haven't decided how many disks I'll start off with. Ideally I'd like to start with 6 x 1TB in 1 RAIDZ2 vdev so I get 4TB of usable space, then expand by filling in the remaining 4 extra bays over time. Unfortunately my understanding is this is not possible with ZFS / FreeNAS at the moment. Is this correct?

If I work around this limitation by filling in all 12 disks I'd have 10TB of space, which is way too much and is going to generate a lot of unnecessary heat.
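For reference, the quick capacity arithmetic I'm using (rough numbers only; this ignores ZFS metadata and padding overhead and the keep-some-space-free guideline):

```python
# Rough RAIDZ capacity check: data capacity = (disks - parity) * disk size.
# Ignores ZFS metadata/padding overhead and free-space guidelines,
# so treat the results as upper bounds.
def raidz_usable_tb(disks: int, parity: int, disk_tb: float = 1.0) -> float:
    if disks <= parity:
        raise ValueError("need more disks than parity drives")
    return (disks - parity) * disk_tb

print(raidz_usable_tb(6, 2))    # 6-wide RAIDZ2, 1TB disks  -> 4.0
print(raidz_usable_tb(12, 2))   # 12-wide RAIDZ2, 1TB disks -> 10.0
```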
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I haven't decided how many disks I'll start off with. Ideally I'd like to start with 6 x 1TB in 1 RAIDZ2 vdev so I get 4TB of usable space, then expand by filling in the remaining 4 extra bays over time. Unfortunately my understanding is this is not possible with ZFS / FreeNAS at the moment. Is this correct?
Yes.
If I work around this limitation by filling in all 12 disks I'd have 10TB of space, which is way too much and is going to generate a lot of unnecessary heat.
If you are talking of using SSDs, there will be near to no heat generated. Heat comes from the mechanical disks. SSDs do generate some heat, but it is so little in comparison that you might as well think of them as running cold. I wouldn't concern myself with having extra room.
Your math doesn't add up though. In one paragraph you talk of six drives and then four extra bays, whereas in the next paragraph you speak of twelve drives. It would be six and six in two vdevs either way, just six to start and add six more later. I am confused by the missing two bays.
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
Your math doesn't add up though. In one paragraph you talk of six drives and then four extra bays, whereas in the next paragraph you speak of twelve drives. It would be six and six in two vdevs either way, just six to start and add six more later. I am confused by the missing two bays.
Ah, yikes, I meant 6 now and 4 later. I forgot it's a 1U, which can only fit 10 SFF bays rather than the 12 SFF bays on the 2U I was almost going to order. Would 6+4 not be as good as 5+5?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Would 6+4 not be as good as 5+5?
It would be much better if they were equal in size rather than having one that is 6 and the other is 4. Also, if we're talking SSDs, I think you can do RAIDz1 instead of RAIDz2 because SSDs are much less likely to fail.
JUST BE SURE TO USE A GOOD-QUALITY SSD. You get what you pay for.
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
Okay, another twist: I saw the following about running FreeNAS on ESXi, and it got me thinking of trying the same.
https://www.ixsystems.com/blog/yes-you-can-virtualize-freenas/

I hadn't thought of running ESXi at home, but I have been running ESXi on a dedicated server in a data center for some production stuff, so having one at home to host dev VMs wouldn't be a bad idea.

As you know, initially I was thinking of doing 2 vdevs. This means each vdev has 5 disks in RAIDZ2, which gets me 6TB of data storage if I use 1TB disks, or 12TB if I use 2TB disks. But if I am to run FreeNAS as a VM inside ESXi, I need a boot disk, so the following is my latest thought...
Disks 1 - 2: RAID0 for ESXi boot disk & datastore
Disks 3 - 10: Single vdev in RAIDZ2, which will get me the same 6TB of data storage if I use 1TB disks.

Sounds great, as I get the same amount of data storage using the same size of disks, except I'm now using only a single vdev, which will have half the IO; but that shouldn't matter as I'm planning to use SSDs anyway.

The big question now is how I am going to get only 2 of the 10 disks in RAID0 and the remaining 8 running in passthrough mode. I don't think this is possible, as there are only 2 SAS connectors on the DL360's expander backplane, both of which need to be connected to the LSI 9300-8i HBA. I guess the only remaining options are:
  1. Use a SATADOM.
    This is attractive, but I think there is no room to plug in a pair and run them in RAID0. In fact I don't think this is possible even with a single SATADOM, given where the ports for the onboard P420i are located. Page 63 of http://h20628.www2.hp.com/km-ext/kmcsdirect/emr_na-c03242811-9.pdf shows they are at a very strange angle close to the edge of the enclosure.
  2. Use a USB disk connected to an internal USB port on the motherboard.
    To be honest I'm not too comfortable with the reliability of USB.
  3. Remove the SAS expander board and cable the HBA directly to the drives, which would also free up bays for a boot pair.
  4. Replace the 4 x 1Gbps FlexibleLOM card with a 2 x 10Gbps FlexibleLOM like the 530FLR, and free up one PCIe slot for an NVMe.
    A shame, as I bought a Mellanox 10Gbps card already, and I still have no RAID0 for the boot disk.
It'd be between 3 and 4, I think. 3 sounds feasible on paper without having to buy another network card, leaving the Mellanox as a spare, but somehow I have a feeling taking out the SAS expander backplane is going to screw something up. 4 would give me another 2 disks for storage and is more likely to work. Thoughts?
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
I bought a FlexibleLOM with 2 x 10Gbps so I can free up the PCIe slot where the Mellanox was going to go and plug in an NVMe or M2 adaptor, but it appears the G8 doesn't support using this as a boot disk, even though you can access the disk via PCIe after boot.

I am going to make use of the onboard P420i, connect one of its ports to 2 x M2 via an SFF 8643 to 4 x SATA cable, and run the pair as RAID0 boot disks. I will find a place to glue or screw on a pair of these slim M2 to SATA adaptors.
https://www.amazon.com/QNINE-Adapte...AQC0RMY02CK&psc=1&refRID=TESS63Q4QAQC0RMY02CK

This is someone else doing something similar with a full 2.5" disk.
https://serverfault.com/a/937955

The DL360 G8 has an optical drive connector on the motherboard, and I think with the 10 SFF backplane there is also a regular Molex power connector. Having a Molex power connector available would make it easier to get power to the M2 to SATA adaptors.
 
Last edited:

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
The motherboard of the DL360 has a slimline SATA connector for the optical drive, so I went to a local computer shopping center yesterday and bought a slimline to regular SATA adapter, plus some cabling to give me a pair of power connections for the boot disks, which I was planning to hook up to the internal P420i and tuck away somewhere inside.

I just happened to bump into this, which got me thinking again about removing the SAS expander backplane and connecting the HBA directly to the disks.
https://www.orangecomputers.com/node/?command=kb&docid=43

Given the server is still sitting at the warehouse, I was thinking about the possibility of removing the SAS expander. If I do this I can do the following:
  1. Use a pair of SFF 8643 to 4 x SATA cables to connect 8 drives in total, then run an 8-wide RAIDZ2 vdev.
  2. Use the remaining 2 bays to host the pair of boot disks.
This is in contrast with the original plan to use all 10 bays to support 2 x 5-wide RAIDZ2 vdevs, which with 1TB disks provide 10TB raw / 6TB usable. The 8-wide RAIDZ2 vdev with 1TB disks provides the same usable storage, 8TB raw / 6TB usable, at the cost of higher resilver times, but I have to buy 2 fewer disks.

In terms of resilver times ... here's what I'm thinking.
Disk Size: 1TB = 1000GB
SAS Speed: 6Gbps = 0.75GBps
Time To Read Whole Disk: 1333.333 seconds = 0.37037 hours

For resilvering an 8-wide RAIDZ2 vdev when 1 disk is lost, we have to read 7 disks in total -> 7 x 0.37037 hours = 2.59 hours.
For resilvering a 5-wide RAIDZ2 vdev when 1 disk is lost, we have to read 4 disks in total -> 4 x 0.37037 hours = 1.48 hours.

Actual time should be quicker because the LSI 9300-8i HBA I'm using has 8 lanes of 12Gbps, so running it with 8 x 6Gbps disks should be more than enough.

@Chris Moore can you comment on the resilvering time please?
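Here is the same estimate as a script, so my assumptions are explicit (each surviving disk read once, one after another, at the full line rate; I know a real resilver won't behave exactly like this):

```python
# Back-of-envelope resilver estimate. Assumes each surviving disk is read
# once, one after another, at the full 6 Gb/s line rate with no overhead.
DISK_GB = 1000                 # 1 TB disk
LINK_GB_PER_S = 6 / 8          # 6 Gb/s = 0.75 GB/s (ignoring encoding)

def read_one_disk_hours() -> float:
    return DISK_GB / LINK_GB_PER_S / 3600    # ~0.37 h per disk

def resilver_hours(width: int) -> float:
    return (width - 1) * read_one_disk_hours()

print(f"8-wide RAIDZ2: {resilver_hours(8):.2f} h")   # ~2.59 h
print(f"5-wide RAIDZ2: {resilver_hours(5):.2f} h")   # ~1.48 h
```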
 
Last edited:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Bypass the 6Gbps limitation. After all I do have a 12Gbps HBA.
Edit: the 6Gb limitation is in the drive, not the controller the drive is connected to. If you are using conventional SATA SSDs, there is no point in even having a 12Gb SAS controller, because the drives are too slow to take advantage of it. I think I tried to explain that to you already.
You would need 12Gb SAS SSD drives if you want to connect them to your SAS controller and actually see 12Gb speeds.
Obviously, there are other options, but here is one to look at:
https://lenovopress.com/lp0568-12gb-sas-enterprise-performance-ssds

Speed usually equates to money. How much money can you afford to spend?

PS. Those Lenovo branded drives are probably made by HGST.
 
Last edited:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
This is someone else doing something similar with a full 2.5" disk.
https://serverfault.com/a/937955
That server chassis is not an ideal environment for your plan. I have internal bays for my boot drives like this in one chassis:

[image: internal boot-drive bays]

and in my other system it has rear mounted hot-swap bays like this:

[image: rear-mounted hot-swap boot bays]
from: https://www.ixsystems.com/community/threads/my-new-48-bay-build.61223/

Other chassis have space internally to mount boot drives, but that HP chassis is not well designed for that option because that is not how they intended it to be used. That is the reason I like to avoid the big companies like Dell and HP: not because it is bad hardware, but because they have a particular way they intend it to be used, which is usually not compatible with the way we need to use the hardware for FreeNAS.
 
Last edited:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I just happened to bump into this, which got me thinking again about removing the SAS expander backplane and connecting the HBA directly to the disks.
If you wanted to change the cabling and try to get 12Gb to the drive, you would need drives like this:
https://www.ebay.com/itm/For-Dell-Toshiba-SAS-1-6TB-2-5-SSD-solid-state-drive-12Gb-s-NEW/273792797266
Even then, the drive is still not as fast as the interface. I think you have a desire for more speed than can be had unless you are willing to spend a lot more money. That is just a random drive I pointed to, to give you a sense of the cost involved, but you have options. I would search eBay for "12Gb SAS SSD" and pick some used ones because they were most likely replaced in favor of larger drives and still have a lot of life remaining. You save a lot of cash buying used in this space, kind of like cars.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
In terms of resilver times ... here's what I'm thinking.
Disk Size: 1TB = 1000GB
SAS Speed: 6Gbps = 0.75GBps
Time To Read Whole Disk: 1333.333 seconds = 0.37037 hours

For resilvering an 8-wide RAIDZ2 vdev when 1 disk is lost, we have to read 7 disks in total -> 7 x 0.37037 hours = 2.59 hours.
For resilvering a 5-wide RAIDZ2 vdev when 1 disk is lost, we have to read 4 disks in total -> 4 x 0.37037 hours = 1.48 hours.

Actual time should be quicker because the LSI 9300-8i HBA I'm using has 8 lanes of 12Gbps, so running it with 8 x 6Gbps disks should be more than enough.

@Chris Moore can you comment on the resilvering time please?
Resilver time will depend on the speed at which the new disk can write data. The kind of HBA is not relevant unless you actually have 12Gb SAS SSDs, because even SAS spinning disks that are rated for a 12Gb interface are not going to be fast enough to matter. What kind of drive are you going to use?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
- Samsung SSD 860 QVO 1TB x 12 = 10TB data in RAIDZ2
These drives are cheap, cheap drives. They are only rated for a maximum sequential read of 550 MB/s and a maximum sequential write of 520 MB/s, which is slower than 6Gb SAS and less than half the speed of 12Gb SAS, and these drives will not approach those speeds in random operations (rough conversion numbers below the links). They are not data-center quality drives and may not hold up well under the workload that ZFS will place on them. You would be better served with used data-center grade drives such as these:
https://www.ebay.com/itm/HGST-Ultrastar-1TB-HUSMR1610ASS201-SAS-SSD-P-N-0B32235-12Gb-s-F-W-B2C0/192879613594
but if you want to save a buck or two, you could go with the SATA data-center grade drives like these:
https://www.ebay.com/itm/New-HP-Intel-S3520-Series-960GB-2-5-SATA-6Gb-s-MLC-Internal-SSD-SSDSC2BB960G7P/113655149392
or you could go to a slightly larger drive like this:
https://www.ebay.com/itm/New-HP-Intel-DC-S3610-Series-1-6TB-2-5-inch-7mm-SATA-III-MLC-6-0Gb-s-SSD/113687480914
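To put rough numbers on the interface comparison (my understanding is that both SATA 6Gb/s and SAS 12Gb/s use 8b/10b encoding on the wire):

```python
# Usable payload bandwidth of an 8b/10b-encoded link: each payload byte
# costs 10 bits on the wire, so usable bytes/s = line rate (bits/s) / 10.
def usable_mb_per_s(line_rate_gbps: float) -> float:
    return line_rate_gbps * 1000 / 10

for gbps in (6, 12):
    print(f"{gbps} Gb/s link: ~{usable_mb_per_s(gbps):.0f} MB/s usable")
# ~600 MB/s at 6 Gb/s: a 550 MB/s SATA SSD nearly saturates the link.
# ~1200 MB/s at 12 Gb/s: no SATA drive comes close to using it.
```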
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It should tell you something when the Samsung EVO, previously the 'value' series drive from Samsung, is $147 and the new QVO series drive from Samsung (in the same size) is $117. The Samsung Pro series drive in the same size is $297, almost twice the price of the EVO. When I made my suggestion of what drive to use, I am sure I said Pro series, but I could be remembering incorrectly and I am not going back to check. Still, the point is, I would not trust the QVO series drive in an array. I probably wouldn't trust the EVO series drive in an array. Before I bought either of those, if I were budget constrained, I would buy the used Intel DC drives from eBay. They are much more likely to survive, and then you don't need to worry about resilver time. I had four of the HGST SAS SSDs in a pool for over a year before they were upgraded to something with more capacity, and I never had a moment of trouble from them.
 
Joined
May 10, 2017
Messages
838
These drives are cheap, cheap drives.

They are cheap and slow. These are QLC SSDs, and after the small SLC cache is full they can only sustain 80MB/s writes, slower than a modern HDD, so they are not really recommended unless the OP only cares about read performance. Even the 860 EVO is many times faster than these for writes, as the 1TB model can sustain around 500MB/s.
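To put that in perspective, rough numbers on how long large writes would take at those sustained rates:

```python
# Hours to write one decimal terabyte (1,000,000 MB) at a sustained rate.
def hours_per_tb(mb_per_s: float) -> float:
    return 1e6 / mb_per_s / 3600

print(f"860 QVO after SLC cache (~80 MB/s): {hours_per_tb(80):.1f} h per TB")
print(f"860 EVO sustained (~500 MB/s): {hours_per_tb(500):.1f} h per TB")
# ~3.5 h per TB once the QVO's cache is exhausted vs ~0.6 h for the EVO.
```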
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
titusc said:
Bypass the 6Gbps limitation. After all I do have a 12Gbps HBA.
Ouch, you are quick. I thought I edited that out. I wasn't thinking about SAS disks because they are expensive and totally not worth it. I was actually thinking about getting 8 x M2 NVMe drives and M2 adapters, but then I realized that whether it is an M2 to SATA or an M2 to U2 adapter, it just won't work: the M2 to SATA adapter is still limited to 6Gbps, and with M2 to U2, although it would work, I could only connect two disks, as the HBA has only 8 lanes and each U2 already takes 4 lanes. I could get a pair of HBAs, each with 4 x SFF8643 ports (16 lanes); between them, a pair of these would support 8 x M2 NVMe disks, but I decided not to pursue that because I'd need to order the cards and wait another week or two. Besides, the only advantage of these faster connections would be resilvering time, because I am limited by the 10Gbps connection from PC to NAS.

I have been thinking about 40Gbps, because I can get a Mellanox IB switch for about $500 USD, but I would need to find a way to lay new fiber. When we renovated the place I didn't think I would need anything beyond 10Gbps, so I went with Cat 6 UTP throughout. That was a mistake, and it is somewhat late now. I guess I could if I really wanted to, but I don't want to expand the scope, otherwise I will never finish the project.
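The lane arithmetic from the first paragraph, written out (assuming the usual x4 lanes per NVMe drive):

```python
# Lane budget for x4 NVMe devices behind a controller.
def max_x4_devices(controller_lanes: int) -> int:
    return controller_lanes // 4

print(max_x4_devices(8))        # one 8-lane HBA          -> 2 drives
print(2 * max_x4_devices(16))   # a pair of 16-lane cards -> 8 drives
```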

I also want to use the other PCIe slot to add an NVMe SLOG, as I want all writes to be synchronous because I want to serve some space for VMs. Yes, I am installing ESXi and then FreeNAS as a VM with the LSI HBA passed through.

In terms of chassis, yeah, I didn't want to go with HP. I am a sucker for Supermicro, but I happened to find such a good deal on this HP that it would have been silly to let it pass. You are right about them having to work in a specific way.

For the drives, you did comment about the QVO being cheap, but I thought they are rated at 360TBW for the 1TB disks. I honestly can't imagine being able to write that much data. This is mostly for storing family photos, so mostly reads. Each RAW is about 25MB, so 1TB is 40,000 photos. An 8-wide RAIDZ2 is 6TB usable. I think I will struggle to fill that up, so by that measure even 1TBW of endurance would be good enough, and 360TBW is a lot already. There may be something I am not aware of, so correct me if this thought process on the endurance required is not right.
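My endurance arithmetic, written out:

```python
# Endurance check for a mostly-read photo archive.
RAW_MB = 25          # approximate size of one RAW photo
TBW_RATING = 360     # rated endurance of the 1TB 860 QVO

print(f"{1e6 / RAW_MB:,.0f} photos per TB written")        # 40,000
# Even rewriting a full 6TB pool five times over its life is only:
print(f"{6 * 5} TB written vs a {TBW_RATING} TBW rating per drive")
```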
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
They are cheap and slow. These are QLC SSDs, and after the small SLC cache is full they can only sustain 80MB/s writes, slower than a modern HDD, so they are not really recommended unless the OP only cares about read performance. Even the 860 EVO is many times faster than these for writes, as the 1TB model can sustain around 500MB/s.
Does this matter if I use a 500GB NVMe SLOG? I thought QVO and EVO have similar read and write speeds and differ mostly on endurance.
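Back-of-envelope on the SLOG size, if my understanding is right that the SLOG only buffers a few seconds of in-flight synchronous writes before they are flushed to the pool disks (so it would not hide a pool that can only sustain 80MB/s), then 500GB would be far more than the log could ever use:

```python
# Rough SLOG sizing: the log only holds synchronous writes that have not
# yet been flushed to the pool, i.e. a few transaction groups (~5 s each
# by default), so useful size is link speed x a few seconds.
def slog_gb_needed(link_gbps: float, seconds: float = 10.0) -> float:
    return link_gbps / 8 * seconds

print(f"10 Gb/s network: ~{slog_gb_needed(10):.1f} GB is ample")  # ~12.5 GB
```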
 

titusc

Dabbler
Joined
Feb 20, 2014
Messages
37
When I made my suggestion of what drive to use, I am sure I said Pro series, but I could be remembering incorrectly and I am not going back to check. Still, the point is, I would not trust the QVO series drive in an array. I probably wouldn't trust the EVO series drive in an array.
Yes, you said that, but I thought about my use case of storing only photos, which is mostly read-only (explained two posts above). Again, please correct me if my line of thought above is flawed.
 