ZRAID for 50+ TB Storage

Status: Not open for further replies.

helenskelter
Cadet | Joined: Feb 8, 2015 | Messages: 4

Hello FreeNAS Community,

I am considering switching to FreeNAS as my primary storage. I am concerned about the viability of ZFS ZRAID for large volumes. Up to this point, I have relied entirely on hardware RAID, so while I understand the principles of ZFS and ZRAID, I have no hands-on experience.

Currently, I have a 24-bay SAS enclosure. It is half full with 12 WD 4TB RE drives in RAID 6 with a hot spare (36TB). I am using an Areca ARC-1882x (LSI chipset). The Areca allows online expansion and runs scheduled parity/corruption checks. The system provides file sharing to VMs, which then distribute services to users, all over 1GbE.

Over the next year or so, I plan to expand to two 24-bay SAS enclosures, adding sets of 12 drives as needed. I would like to upgrade to 10GbE and primarily use iSCSI to distribute storage to VMs.

I would like to use the purchase of the second SAS enclosure as an opportunity to migrate to FreeNAS. My primary concern is performance/cost. The ARC-1882x can effectively saturate a 10GbE line (with and without write cache) and can do so without my spending $$$ on additional RAM. ZRAID's 1GB RAM to 1TB storage ratio becomes an issue at my next three expansion points (72TB, 108TB, 144TB), assuming that ratio is still valid at those capacities.

My alternatives, as I see them, are to run FreeNAS ZFS on top of the hardware RAID, adding 36TB RAID 6 volumes as needed for expansion, or to set the RAID card to passthrough/JBOD and add sets of 12 disks in ZRAID3 as needed, with an eventual 128GB of RAM. So my question is: what does the community advise in this situation? Any experience/pointers/critiques?
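
For concreteness, here is my rough understanding of what the JBOD/ZRAID3 route would look like (this is only a sketch I haven't run; the pool name and device names are made up):

Code:
# pool name "tank" and da0-da23 device names are made up
# first set of 12 disks as one RAIDZ3 vdev; 9 data disks x 4TB = 36TB raw, like the current RAID 6 set
zpool create tank raidz3 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11
# later expansion: add the next set of 12 as a second RAIDZ3 vdev
zpool add tank raidz3 da12 da13 da14 da15 da16 da17 da18 da19 da20 da21 da22 da23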

A secondary question: if I do go with the hardware RAID, how would ZFS handle a volume that increases in size? I can buy drives in sets of 12, but it's nice to add them incrementally, building the additional RAID 6 arrays in chunks of 2-4 drives.
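
From what I've read so far, ZFS can grow into a backing device that gets bigger, roughly like this (again only a sketch, with a hypothetical pool "tank" sitting on the Areca volume showing up as da0):

Code:
# "tank" and da0 are placeholders
zpool set autoexpand=on tank   # allow the pool to grow when a backing device grows
# after expanding the RAID 6 array on the controller, tell ZFS to claim the new space:
zpool online -e tank da0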

Thanks in advance for time spent reading and considering my situation.

All the best,
Helen
 

anodos
Sambassador, iXsystems | Joined: Mar 6, 2014 | Messages: 9,554

You shouldn't do hardware RAID with ZFS. ZFS acts as both the filesystem and the volume manager. Use the money you save by using a simple HBA to buy more RAM.
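
As a rough illustration of the "volume manager" part (the pool and dataset names below are just examples): once the pool exists, you carve filesystems and block devices directly out of it instead of building LUNs on a RAID card.

Code:
# "tank" and the dataset names are examples only
zfs create -o compression=lz4 tank/shares   # a filesystem dataset for SMB/NFS
zfs create -s -V 4T tank/vm-lun0            # a sparse zvol to export over iSCSI
zfs set quota=10T tank/shares               # per-dataset space management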

Read the stickies and cyberjock's ZFS guide for more info.
 

cyberjock
Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526

I'm like 99% sure that controller is NOT LSI. In fact, I believe I worked with someone who tried to use that controller and lost all of their data (they happened to have no backups at the time of the failure).

If you are planning to use that controller (and *especially* if you plan to use any kind of hardware RAID), you should give up on ZFS and never look back. Merging ZFS and hardware RAID is a big freakin' mess, and I *will* ignore your cries for help later (and you definitely will be crying later).

I think you should read our stickies and my noobie guide. That's a good place to start.

The RAM-to-disk-space rule of thumb is not a hard and fast rule. Sometimes you need more, sometimes you need MUCH more, sometimes you need less. It depends on your workload and required performance. But if you plan to go with 144TB of disk space, the rule of thumb is there so that you expect to need something around 144GB of RAM. That's all. In short, don't put 32GB of RAM in your FreeNAS box and expect it to perform.
 

helenskelter
Cadet | Joined: Feb 8, 2015 | Messages: 4

Anodos: Thanks for your reply. This is somewhat sad news, but there are plenty of alternatives to FreeNAS, even if they are not as elegant out of the box.

Unfortunately, since I already own the RAID card, there is no opportunity to save money. If I were starting from scratch, I might have opted for a simple HBA instead.

----

Cyberjock: Areca puts custom firmware on top of an LSI chip, generally achieving better control, performance, etc. Here is a review of the internal version of the 1882x: http://www.tweaktown.com/reviews/4670/areca_arc_1882i_raid_controller_review/index12.html

The Areca has kept my storage live with functional I/O through drive failures, RAID migration/expansion, and regular parity checks. From my research, these cards are highly regarded, and my experience has been overwhelmingly positive.

----

The prospect of ZFS looks very expensive in my case. I believe my server hardware has a memory limit of 128GB, so it would require an entire system upgrade in addition to $1000+ of RAM.

But before I abandon FreeNAS altogether, I'd like to ask one further question. UFS is mentioned here and there in the forums. It is usually derided, but would it be a viable option to accommodate the hardware RAID? I am really interested in FreeNAS for the out-of-the-box SMB/NFS/AFP/iSCSI support along with LDAP/Kerberos authentication. Of the ZFS perks, the only one I'd really make use of is compression.


Also, just out of curiosity, has either of you (or anyone here) had any experience with FreeNAS arrays around or over 100TB? And is anyone seeing performance in the 800-1000MB/s range on non-SSD arrays (on large file copies)?

----

Thanks again for your time and consideration
 

cyberjock
Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526

UFS is a deadly sin at large sizes. Running fsck on a large UFS partition has been known to take absurdly long periods of time, and very large servers have seen downtime measured in days because of the size of the file system.

I'm still much too hesitant to recommend that card on FreeNAS. If it's not in use, you could try running SMART tests like "smartctl -a /dev/diskid" or "smartctl -t short /dev/diskid". Note that if you have to add any Areca-specific smartctl options (I had to when I did some experimenting with my Areca), that's a death knell for SMART on FreeNAS, and it is one of the primary reasons why RAID cards are a fail for FreeNAS.
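
For reference, with a plain HBA the tests really are that simple (the /dev/da0 device name below is hypothetical; behind an Areca you typically end up needing controller-specific syntax such as -d areca,N, which is exactly the kind of breakage I mean):

Code:
# /dev/da0 is a made-up example device name
smartctl -a /dev/da0         # dump SMART attributes and the self-test log
smartctl -t short /dev/da0   # start a short self-test; check the result later with -a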

I was a major proponent of Areca before I went to ZFS (I had to buy an HBA when I switched to ZFS after almost losing my pool to the Areca, even in JBOD mode; I thought I was beating the system with some custom scripts to monitor SMART, but I was wrong). I've worked with some 1880 cards, and those do not support SMART at all, no matter what I tried.

On my old pool I was capable of saturating my 10GbE LAN for short bursts. My system wasn't designed for that kind of throughput; I just got lucky, and CIFS really limits you to about 450MB/sec tops for a single client anyway. It's totally possible to hit 1GB/sec; it's all about how you build the system. Some of the TrueNAS servers you can buy have quad 10GbE, and it is very possible to get that kind of throughput with a properly designed system.
 

helenskelter
Cadet | Joined: Feb 8, 2015 | Messages: 4

Cyberjock: Thank you so much for your response. Very helpful information.

I am convinced that I should absolutely not mix ZFS and the Areca, and obviously UFS is not an option. So now I am torn between migrating to FreeNAS and ditching/repurposing the Areca, or using a Linux/iSCSI solution. I'd prefer to have SMB/NFS/AFP in place so I can migrate to iSCSI incrementally (and for easier administration), but in the end, I'll be distributing storage to VMs via iSCSI and letting them take care of the other protocols on an individual basis.

Question: Are there more detailed hardware specs on the TrueNAS systems than seen here: http://www.ixsystems.com/static/downloads/pdf/TrueNAS_2_0_Datasheet_20141006.pdf

The RAM to storage capacity listed there gives me some hope. Only one model exceeds my limit of 128GB of memory. It's not clear to me how a single system supports so much storage though. Is that through a single server with expansion units or distributed across nodes? It would be helpful to see what processors are in use. I really am drawn to ZFS/FreeNAS but the cost might just be too prohibitive.

I believe one of the differences between FreeNAS and TrueNAS is a mirrored OS drive. Is that a hardware solution, or ZFS that provides the mirroring?

And one last thing: how power hungry is FreeNAS? It seems like all that memory/processing power would pull a lot more than a raid card on a mid-range server, but maybe the difference would be inconsequential compared to the power consumed by the drives?
 

cyberjock
Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526

helenskelter said:
Question: Are there more detailed hardware specs on the TrueNAS systems than seen here: http://www.ixsystems.com/static/downloads/pdf/TrueNAS_2_0_Datasheet_20141006.pdf

Not really. When you call iXsystems for a quote, they ask a bunch of questions and pick a build that best aligns with your needs and expectations. Some users demand the lowest-end build (even if it may not be in their best interest), but that's life.

CPU power is generally not a limitation with FreeNAS and TrueNAS. It doesn't take much CPU before the bottleneck ends up elsewhere (such as the disks or the network connections).

helenskelter said:
The RAM to storage capacity listed there gives me some hope. Only one model exceeds my limit of 128GB of memory. It's not clear to me how a single system supports so much storage though. Is that through a single server with expansion units or distributed across nodes? It would be helpful to see what processors are in use. I really am drawn to ZFS/FreeNAS but the cost might just be too prohibitive.

It's a single server. Keep in mind that scaling vertically rather than horizontally isn't always best. Sometimes a few smaller servers are better than one extremely large server.

helenskelter said:
I believe one of the differences between FreeNAS and TrueNAS is a mirrored OS drive. Is that a hardware solution, or ZFS that provides the mirroring?

Not true. Both provide redundancy for the OS drive.

helenskelter said:
And one last thing: how power hungry is FreeNAS? It seems like all that memory/processing power would pull a lot more than a raid card on a mid-range server, but maybe the difference would be inconsequential compared to the power consumed by the drives?

If you choose proper hardware, you'll find idle power is very low. My machine, with no drives, idled at 34W. I know my Areca draws 10W idle, and you still have the CPU waking up even with hardware RAID. If you are silly and build a box with FB-DIMMs, those draw something like 8W per stick. It's just about knowing what you should and shouldn't do. Besides that, a difference of 20W is not going to be appreciable. If you are going to argue that you are unhappy because FreeNAS would use 20W more than some hardware RAID solution, then you should go to something like a Synology. There's a reason why those things are extremely low power: they're custom silicon designed for the task. My old FreeNAS system with 24 drives spinning 24x7 used something like 200W idle and something like 300W when scrubbing the zpool. I think that is very reasonable in the big scheme of things.
 

jgreco
Resident Grinch | Joined: May 29, 2011 | Messages: 18,680

helenskelter said:
The RAM to storage capacity listed there gives me some hope. Only one model exceeds my limit of 128GB of memory. It's not clear to me how a single system supports so much storage though. Is that through a single server with expansion units or distributed across nodes?

There's some discussion of how SAS expanders and shelves are used in my SAS-sy sticky; feel free to ask questions because what is obvious to me is not necessarily obvious to you. I use questions I've failed to address as a guide to improve my stickies. A single FreeNAS host could have a lot of drives attached, though there are probably practical limits.

The rules for memory sizing of a FreeNAS system are less defined as you increase RAM. The 1GB-per-TB rule is not a guarantee, just something that helps people grasp that they're not going to want to put a 32TB pool on an 8GB machine. An 8TB pool might still need more than 8GB, especially for something like VM storage. An 80TB pool might be supportable on a 32GB system if performance isn't a significant problem and utilization isn't heavy. In the end, it is a matter of analyzing system performance and figuring out whether your resources are sufficient.
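
One rough way to do that analysis on FreeBSD/FreeNAS is to watch the ARC counters while your real workload runs, for example:

Code:
sysctl kstat.zfs.misc.arcstats.size                                 # current ARC size in bytes
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses  # cumulative hit/miss counters
# A persistently poor hit-to-miss ratio while the working set is active is a
# decent hint that more RAM would actually get used.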

iSCSI block storage is stressful on any CoW filesystem like ZFS, and (at least with ZFS) it pushes the memory requirements higher if you want good performance. You want to maintain a large amount of free space in the ZFS pool (50% free or more!), and more RAM helps tremendously.
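
One crude way to enforce that headroom (just a sketch, with a hypothetical pool name and sizes) is to hard-cap the root dataset at roughly half the pool and create the iSCSI extents as sparse zvols:

Code:
# "tank" and the sizes are placeholders
zfs set quota=7T tank             # cap usage at about half of a ~14TB pool
zfs create -s -V 2T tank/vm-lun0  # sparse zvol for an iSCSI extent; space is consumed only as it is written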

I've been working with a test filer here that'll be deployed for VM storage later this year. E5-1650, 64GB RAM, 256GB L2ARC, 4 x 2.5" 1TB WD Red drives in a 2 x mirror stripe. With data held in ARC, I can get 600MByte/sec read throughput to a single VM without much effort or tuning. Speeds drop dramatically for pool I/O, of course, since the test pool is very small.

But the other thing is that I want to keep fragmentation low as time goes on, and our requirement of not losing redundancy on a single disk failure means three-way mirrors. We're using a 2U 24-bay 2.5" drive server, so we're limited to seven mirrors (drives 22-24 are warm spares), and with the largest 2.5" drive being 2TB (which I'm expecting to use for production), that's 14TB total. Since I'm speccing a 50% hard cap on the pool, that means 48TB of drives only delivers about 7TB of usable space.

It ought to be capable of some real awesomeness for a spindle-based filer though. I kinda expect flash prices to drop sufficiently that we might be able to go all-SSD in a year or three.
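
For the curious, the production layout sketched above would look something like this (only a sketch, with made-up da0-da23 device names):

Code:
# device names are made up; seven three-way mirror vdevs (21 drives),
# with da21-da23 sitting outside the pool as warm spares
zpool create vmpool \
  mirror da0 da1 da2     mirror da3 da4 da5      mirror da6 da7 da8 \
  mirror da9 da10 da11   mirror da12 da13 da14   mirror da15 da16 da17 \
  mirror da18 da19 da20
# 7 vdevs x 2TB = 14TB of pool capacity; a 50% hard cap leaves ~7TB usable from 48TB of raw drives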

The real question is how much storage you need to deliver to the initiators, what sort of traffic it will be, what the working set size is like, etc. For us here, for example, the filer is aimed at excellent read performance and the ability to sustain a modest amount of heavy write traffic for a limited number of VMs, since our VM environment is designed to avoid heavy writes. The design I suggested above clearly aims at that sort of usage pattern, with massive amounts of pool read capacity.
 

cyberjock
Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526

I will say that I haven't seen too many TrueNAS boxes with >96GB of RAM. They are out there, but they are a small part of the servers that are "in the field".
 

helenskelter
Cadet | Joined: Feb 8, 2015 | Messages: 4

Thank you all for your helpful responses!

After considering all of this information, I think I will stick with Linux/hardware RAID for the time being. I will try to get LIO iSCSI working in a testbed and use that to deliver block storage via 10GbE to ESXi VMs.
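
In case it is useful to anyone else, my rough plan for the testbed looks something like the following (all names and IQNs are placeholders, and I haven't verified this end to end yet):

Code:
# device, backstore, and IQN names are placeholders
targetcli /backstores/block create name=vmstore dev=/dev/md0   # block backstore on the RAID volume
targetcli /iscsi create iqn.2015-02.lab.example:vmstore        # create the iSCSI target
targetcli /iscsi/iqn.2015-02.lab.example:vmstore/tpg1/luns create /backstores/block/vmstore
targetcli /iscsi/iqn.2015-02.lab.example:vmstore/tpg1/acls create iqn.1998-01.com.vmware:esxi01
targetcli saveconfig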

The cost of RAM and 50% storage overhead is too high in my case, and there are too many unknowns in regard to performance.

That said, I would like to revisit FreeNAS/ZFS for a future system, probably an offsite archive/backup where performance is a lower priority. Once I become acquainted with it, I might choose to use it on my next production system.
 

jgreco
Resident Grinch | Joined: May 29, 2011 | Messages: 18,680

Yes, ZFS can be a bit painful for block storage, but once you get "over the hump," all of a sudden magic happens. Linux with hardware RAID won't have all the nifty features, but it'll have a lower barrier to entry and somewhat more predictable performance.

The thing about ZFS is the magic that can happen once you throw the appropriate resources at it. What other filesystem would give you 600MBytes/sec of random traffic from what is essentially two disks (again, that was for data already in ARC, but still)? I certainly appreciate how annoying it is to have to cough up those resources, though. To conventional UNIX guys, it feels kind of obscene.
 