100TB array on 32GB of RAM - will it bottleneck?

Status: Not open for further replies.

mediahound (Dabbler | Joined: Mar 11, 2013 | Messages: 15)
I'm the guy who posted in Introductions who eventually needs to build a 200TB-1PB array (unique data, before mirroring/backups). I'm not a sysadmin and am a total newbie who might even be using terms wrong, but I keep reading and taking notes, trying to understand.

I'm currently exploring hardware options and trying to get a better understanding of the different possible ways to do it, because all that storage does not need to be on a single NAS unit (as far as I can tell; see my intro post for more info), and I'm trying to understand exactly how far a person can reasonably stretch a single NAS box in terms of total storage. For comparison I'm speccing out boxes that max out at 32GB, 64GB and 128GB of RAM, with a guesstimated per-drive-slot cost of about $40-60 added on the server end to whatever each drive costs (USB or SATA). Although I'm not coming in any cheaper than Backblaze storage pods, they're running RAID6 (which I don't trust because of silent data corruption), whereas I'm primarily interested in ZFS to prevent exactly that, which means building around its heavy RAM requirements by design. The primary bottleneck therefore seems to be how much RAM you can practically put on the motherboard.

I am assuming (correct me if I'm wrong) that you can network together multiple NAS boxes (and essentially sum their RAM) to get the total storage and performance needed? I.e., if nothing else, three boxes with 32GB of RAM and 32TB of disk apiece should perform well and be treated as a single volume if properly configured? Or do I have this completely wrong? :P If I'm right it eases some of the issue, since cost per connected drive goes up noticeably once you have to use 16GB modules and server-class motherboards. 32GB boards are consumer level, and an i3 is low enough wattage that running several boxes isn't a big problem to get the desired number of supported drives.

I've often seen a rule of thumb of 1GB of RAM recommended per terabyte of drive space. I can understand that applying at lower levels, but I'm wondering if it still holds just as true when you start maxing out motherboard RAM slots. Are there any real-world examples to go by, of people with 32GB or more of RAM and total drive storage in excess of 32TB, to act as reference points?

How directly does the RAM-to-drive-space ratio affect performance, and what does "affecting performance" actually mean? If I only need a fairly modest sustained bandwidth (say 15MB/sec of typical constant access), is that what suffers, or is it more of an IOPS limitation? Does performance degrade linearly as RAM falls short, or does it collapse into some kind of catastrophic uselessness (like drive thrash dropping performance to 1%)? Or is the part that suffers the necessary background housekeeping (like scrubbing) that ZFS needs to do at some minimum expected performance level?

Is there any way to offload some of the RAM requirement onto SSD (in the same way deduplication tables can be offloaded to SSD; note that I don't really understand what ZFS is doing with all that RAM either) to more readily deal with pretty darn large arrays on a single NAS?
 

cyberjock (Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526)
I am assuming (correct me if I'm wrong) that you can network together multiple NAS boxes (and essentially sum their RAM) to get the total storage and performance needed? I.e., if nothing else, three boxes with 32GB of RAM and 32TB of disk apiece should perform well and be treated as a single volume if properly configured? Or do I have this completely wrong? :P If I'm right it eases some of the issue, since cost per connected drive goes up noticeably once you have to use 16GB modules and server-class motherboards. 32GB boards are consumer level, and an i3 is low enough wattage that running several boxes isn't a big problem to get the desired number of supported drives.

I've often seen a rule of thumb of 1GB of RAM recommended per terabyte of drive space. I can understand that applying at lower levels, but I'm wondering if it still holds just as true when you start maxing out motherboard RAM slots. Are there any real-world examples to go by, of people with 32GB or more of RAM and total drive storage in excess of 32TB, to act as reference points?

How directly does the RAM-to-drive-space ratio affect performance, and what does "affecting performance" actually mean? If I only need a fairly modest sustained bandwidth (say 15MB/sec of typical constant access), is that what suffers, or is it more of an IOPS limitation? Does performance degrade linearly as RAM falls short, or does it collapse into some kind of catastrophic uselessness (like drive thrash dropping performance to 1%)? Or is the part that suffers the necessary background housekeeping (like scrubbing) that ZFS needs to do at some minimum expected performance level?

Is there any way to offload some of the RAM requirement onto SSD (in the same way deduplication tables can be offloaded to SSD; note that I don't really understand what ZFS is doing with all that RAM either) to more readily deal with pretty darn large arrays on a single NAS?

The only one of your questions with a solid answer is the last one. You can use SSDs as L2ARC, but L2ARC itself consumes some RAM to index what's on the SSD. And it's not a dumb cache: data only lands in L2ARC after it has recently been read and cached in RAM first.
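For reference, attaching an SSD to an existing pool as L2ARC is a one-line operation; the pool name "tank" and the device name below are placeholders, so treat this as a rough sketch rather than a recipe for any particular box:

Code:
zpool add tank cache gpt/l2arc0   # add an SSD partition as a cache (L2ARC) device
zpool iostat -v tank 5            # confirm the cache device appears and watch how much of it fills

The caveat above still applies: every block cached on the SSD needs a small header kept in RAM, so L2ARC supplements a properly sized ARC rather than replacing it.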

The rest of your questions are answered with "kinda", "sorta", "sometimes", etc. Those answers depend on your actual hardware, how it's used, and how it's configured. Almost all of your choices depend on many other factors like typical usage load, how the data is consumed, and so on. That's why in the other thread I ended with "you really need to get an IT guy who knows FreeNAS/FreeBSD to help you". These questions don't have solid yes/no answers; what is a very solid "yes" in one situation can be a very strong "no" in another.

That's also why I said that very large systems bring unique challenges that require an admin to identify and overcome.

The rules of thumb are only guidelines. You may be able to get by with less (or much less), but you may also need more (or a lot more). They're a good starting point if you have no idea, but shouldn't be blatantly ignored. I needed 20GB of RAM on my 32TB zpool to get decent performance (12GB wasn't cutting the mustard). The scaling may be linear or it may be exponential; again, it depends on how the data is consumed, your hardware, and other factors.

Your questions are best left to someone with in-depth knowledge of how your data will be consumed, what kind of network connection it will have, and so on. Guesswork on a system of your size is hazardous at best.

I will tell you that if you plan to build a 100TB array you shouldn't try anything smaller than 32GB of RAM, but I'd buy hardware fully expecting to need at least 64GB. You might find you need more than 64GB, but a system admin will have to make that call.

There are no solid guarantees in big-disk world. Choices, but not guarantees. Guarantees come either after a system is built and you find out whether it works as expected, or from a good admin who has sat down with you, discussed all the specifics, and can predict your needs.

Edit: It is quite possible that the number of people on this forum who have even dealt with zpools bigger than 40-50TB is fewer than three, and perhaps zero.
 

mediahound (Dabbler | Joined: Mar 11, 2013 | Messages: 15)
To go in reverse order on the above comment: It is quite possible that the number of people on this forum who have even dealt with zpools bigger than 40-50TB is fewer than three, and perhaps zero.

Well, that makes a lot more sense then. Part of the reason for my questions was "has anyone else even DONE this??"; if it's such a tiny handful of people, possibly nobody, then I guess I'm entering unknown territory. I was hoping maybe there would be notes from some Solaris enterprise-class solution, but maybe it doesn't translate over well, hmm...

It is tolerable to scale up incrementally rather than all at once. To be honest, most of the datasets that I might want on a single archive are unlikely to exceed 24-32TB anyway; there's one set that might hit 50-60TB. Even breaking up a dataset is not impossible, I've already done it with arrays of 3TB external drives under Windows, it's just an increasing PITA to have to do so. A substantial reduction in "sysadmin" duties is by itself worth implementing. The migration to a single monolithic ZFS system need not happen all at once or overnight; it was more about wondering who had pushed the limits in the past, assuming there must have been some business customer who did. :P


One reason I haven't nailed down many hardware specifics is that that's still up in the air; other than wanting big storage on the cheap there aren't a lot of criteria, and I wanted to be deliberately vague so that people could suggest a few ways to do it. The rough outline I gave on the other subforum: 15MB/sec of constant, never-dropping throughput, on a 1-gigabit Ethernet link to start, with files large enough that it shouldn't be a problem; it's not like swamping a RAID array with 2KB files, more like a few megabytes at a time normally.

The rest of your questions are answered with "kinda", "sorta", "sometimes", etc. Those answers depend on your actual hardware, how it's used, and how it's configured. Almost all of your choices depend on many other factors like typical usage load, how the data is consumed, and so on. That's why in the other thread I ended with "you really need to get an IT guy who knows FreeNAS/FreeBSD to help you".

How about just aiming for the upper tier for now, something others are doing without needing a full-on sysadmin to push the frontier... is 32TB on a 32GB RAM system pretty much a known quantity? If anyone has done a 40-50TB array I'd love to read about how they set it up; it's more important that I start on something that actually IS startable. I'll then explore expanding it a bit and see what happens, since I consider it an exploratory system anyway... or maybe I'll load it with less RAM, like 16GB on a 32GB-capable board, try a 24TB array, then try to expand it and see how far I get before problems show up (unless someone else has already done that).

That said, I'm still trying to understand whether I can set up a single monolithic zpool across two physical machines (two motherboards, each with its own connected drives) networked together, even at a lower level. I thought I had read somewhere (a while back) about someone doing exactly that, unless they were using some additional software to somehow concatenate NAS boxes.
 

cyberjock (Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526)
Well, that makes a lot more sense then. Part of the reason for my questions was "has anyone else even DONE this??"; if it's such a tiny handful of people, possibly nobody, then I guess I'm entering unknown territory. I was hoping maybe there would be notes from some Solaris enterprise-class solution, but maybe it doesn't translate over well, hmm...

I'm sure there are some around via Google. But Oracle shut down all the Sun websites they could get their hands on, because Oracle only makes money selling you solutions, not giving you the knowledge to do it yourself. This is one of many reasons why I said you need to get an IT guy to help you. If you aren't working in IT, you are months away from anything close to an understanding of what you want/need and what you can/can't do.

It is tolerable to scale up incrementally rather than all at once. To be honest, most of the datasets that I might want on a single archive are unlikely to exceed 24-32TB anyway; there's one set that might hit 50-60TB. Even breaking up a dataset is not impossible, I've already done it with arrays of 3TB external drives under Windows, it's just an increasing PITA to have to do so. A substantial reduction in "sysadmin" duties is by itself worth implementing. The migration to a single monolithic ZFS system need not happen all at once or overnight; it was more about wondering who had pushed the limits in the past, assuming there must have been some business customer who did. :P

More than likely, the people who have the experience you want don't browse the forums much, because they know what they need to know for their job and don't plan on helping newbies out a whole lot. They make great money and are happy to keep that knowledge to themselves. Not to sound repetitive, but this is another reason why I said you need to find an IT guy to contract with.

That said, I'm still trying to understand whether I can set up a single monolithic zpool across two physical machines (two motherboards, each with its own connected drives) networked together, even at a lower level. I thought I had read somewhere (a while back) about someone doing exactly that, unless they were using some additional software to somehow concatenate NAS boxes.

There's no feature I know of. I can think of a few ways you could cheat the system, but I'm not about to discuss them because:

1. I'm not sure I'd recommend them for your configuration, let alone any configuration.
2. Someone who isn't in the IT business won't understand the risks involved anyway, no matter how I explain them.
3. They may not even work, or they may cause unforeseen data loss. I don't want to be the person who gave you the bad idea.

Yes, I keep saying you need an IT person, and I'm really not sure how many threads you are going to create before you realize you are in FAR over your head. I've been in IT for 20 years, and when I moved to FreeBSD I was so lost at first it wasn't funny. Since I had a month sitting at home all day, I spent 90% of my waking time just reading manuals, guides, and forum posts before I even built my first FreeNAS machine. Considering how many "IT experts" have had complete data loss because of a mistake they made months earlier and weren't aware of at the time, I'm beyond hesitant to try to provide any support.

If IT experts can experience a complete loss of data with no chance of recovery, what can I expect from someone who has already said they don't want to be the system admin and aren't an IT person? You really can't just read the FreeNAS manual and know all you need to know; it's far more complex than that. You basically want to be a car mechanic without spending the years getting the experience needed to understand what you're getting yourself into.

You should just stick to what you know until you can afford to pay someone to do it right the first time, not the second time (after you've lost data permanently). But judging from all the threads I see you've started, I'm not expecting you to listen to the advice provided, since you haven't yet...
 

mediahound (Dabbler | Joined: Mar 11, 2013 | Messages: 15)
I know it's starting to sound like a broken record, but it's the feeling that I'm not being understood that has me re-asking certain things from different angles. I don't understand what fundamental difference is in play here. As near as I can tell:

One Zpool 8TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 16TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 32TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 33TB = You need a 100k/year expert!
One Zpool 32TB = Not okay anymore because you asked about the forbidden size, you need a 100k/year expert.
One Zpool 16TB = Not okay anymore because you asked about the forbidden size, you need a 100k/year expert.
One Zpool 8TB = Nope, too late, you said something you shouldn't, you won't be recommended to build anything at all now.

or just:

Two Zpool machines of 17TB each, totaling 34TB = You still need a 100k/year expert, even if normal people can run a single 17TB system themselves, because that's over the allowable amount of storage on one premises.

I'm sorry to sound frustrated, but either ZFS and FreeNAS work with less data corruption and sysadmining than Windows 7 or they don't. It either works well up to some barrier or it doesn't. If I can't go "beyond the frontier", then I just want to know what I can do that's within the known, so I'm hoping this is reasonable. I don't see how changing what I was asking about still merits the same response, especially after realizing that the vagueness of my initial posts was probably part of the problem, even though it was my attempt to get open-ended brainstorming. Asking what's possible in other threads doesn't mean I'm not taking advice; it means either that an unexpected problem is forcing me to redefine what my needs really are and compromise somewhere, or that I'm exploring one of several potential ways to solve other related problems, like data migration to offsite backups (it's either mail whole 32TB NAS machines around, or mail just the drive sets around and import them).

I just want to build something around a 24-32TB zpool now. If I can't do what I ideally want, I'll just keep compromising on the least important aspects, even if the plan ends up being ten boxes this size, because I don't see how that changes anything: get the per-NAS cost down to the minimum overhead per drive slot, then just add boxes. 90% of these machines will be shut off at any given time, because once an array is full it doesn't need to be accessed much at all. Even if I'm forced to split up one future dataset, that's no different from now; it's just split every 32TB instead of every 3TB like I'm doing already.

It's not the size of the data I care about, it's the integrity verification and the reduced sysadmining needed to get it. ZFS brought me here, but if there's another interim solution, not as perfect but better than Windows, I'm still all ears. It mostly needs to turn on once every few months, verify its data is intact with a scrub, and then be shut off again, to save the data for a future ten years from now, when I hope to have the budget to move everything onto a properly set up program and when I hope to find it hasn't lost a bit. What that means right now is sitting in front of the monitor for DAYS, going directory by directory, checking files against their MD5s. In ten years I'll hopefully be working a six-figure job and am committed enough to write a big check to have others oversee the archive as a nonprofit. Until then the money just isn't available, no matter how much it needs to be, and it's about saving what can be saved within my means if I care enough to try.
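(Side note: that directory-by-directory pass could in principle be scripted rather than done by hand. A rough sketch, assuming GNU md5sum is available and one *.md5 checksum file sits in each directory; the archive path is a placeholder:

Code:
find /mnt/archive -type f -name '*.md5' -execdir md5sum -c --quiet {} \;

This walks the archive and prints only the files whose checksums no longer match.)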

Being told I can't do it with ZFS just forces me to keep doing it through Windows 7, and to keep dealing with the extra sysadmin time required to make MD5s and RAR everything up into recoverable archives to stick onto hard drives, a clunky workaround to prevent regular data loss. It slows the rate I'm able to capture data, because right now I spend more time doing that than capturing. If what I'm hearing is that this is as good as it gets, I either have to get used to it or abandon my project to salvage things.
 

cyberjock (Inactive Account | Joined: Mar 25, 2012 | Messages: 19,526)
I know it's starting to sound like a broken record, but it's the feeling that I'm not being understood that has me re-asking certain things from different angles. I don't understand what fundamental difference is in play here. As near as I can tell:

One Zpool 8TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 16TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 32TB = TOTALLY SAFE! Fine for "normal people" to create and sysadmin themselves.
One Zpool 33TB = You need a 100k/year expert!
One Zpool 32TB = Not okay anymore because you asked about the forbidden size, you need a 100k/year expert.
One Zpool 16TB = Not okay anymore because you asked about the forbidden size, you need a 100k/year expert.
One Zpool 8TB = Nope, too late, you said something you shouldn't, you won't be recommended to build anything at all now.

or just:

Two Zpool machines of 17TB each, totaling 34TB = You still need a 100k/year expert, even if normal people can run a single 17TB system themselves, because that's over the allowable amount of storage on one premises.

I'm sorry to sound frustrated, but either ZFS and FreeNAS work with less data corruption and sysadmining than Windows 7 or they don't. It either works well up to some barrier or it doesn't. If I can't go "beyond the frontier", then I just want to know what I can do that's within the known, so I'm hoping this is reasonable. I don't see how changing what I was asking about still merits the same response, especially after realizing that the vagueness of my initial posts was probably part of the problem, even though it was my attempt to get open-ended brainstorming. Asking what's possible in other threads doesn't mean I'm not taking advice; it means either that an unexpected problem is forcing me to redefine what my needs really are and compromise somewhere, or that I'm exploring one of several potential ways to solve other related problems, like data migration to offsite backups (it's either mail whole 32TB NAS machines around, or mail just the drive sets around and import them).

Nope. Your noobness is showing through with each post, and this one is no different (no offense).

You're oversimplifying everything (which is pretty common among newbies). Nobody on this forum will EVER say something like "this is safe and this is unsafe." What an admin is capable of doing safely (and inexpensively) is heavily based on their IT experience and knowledge. Lots of IT experience = lots of options (safely). Not enough experience = lots of tears. When someone shows up and says "I don't work in IT and I don't want to be a sysadmin", the generic (and very accurate AND very safe) response is "find an IT guy with a background in both ZFS and IT". Just knowing FreeBSD/FreeNAS/ZFS isn't enough; you really have to be familiar with computer hardware too. There are people who have built systems with just 5TB of storage and lost everything because they didn't know better.

The forum is NOT the appropriate place to spend months of research and then think you know enough to do ZFS safely; the knowledge required to run FreeNAS safely goes beyond a manual and a forum. My mom couldn't install FreeBSD even with a YouTube video to follow. My closest friend could install FreeNAS from an ISO, but could never administer it, regardless of size. Still others could build a system of compatible hardware, but could never use FreeBSD/FreeNAS to save their life. The required knowledge is a combination of hardware, software, and administration experience, as well as knowledge of ZFS, Unix permissions, and more. If your data is important, you have to decide for yourself how much time you want to spend in the hope of "maybe" doing it right. There are people who have worked in IT for 10 years, built a FreeNAS system, and lost irreplaceable data 60 days later.

So when you get told to find an IT guy (which was me telling you), I'm not saying you should; I'm saying you'd better, if you value your data. You can't get a crash course in this stuff, and you WILL lose data. Don't expect significant forum help if you can't explain in detail what went wrong, either. The solution (if there even is one) depends HEAVILY on YOU knowing what went wrong. The average person who hasn't worked in IT for years is not going to have that answer... so they lose everything.

The #1 rule for FreeNAS, based on my experience on this forum: recognizing when you are in over your head and being willing to just say "I can't do this on my own." There are plenty of people who swear they can do it on their own. The ones who couldn't... they lose their data and realize they couldn't make it over the wall. The rest of us are thrilled to be using ZFS. ZFS/FreeNAS/FreeBSD is more of a rite of passage than a free-for-all: either you can swim with the big boys or you can't. There's nothing wrong if you can't, but being able to acknowledge it is far more important to your data than incorrectly convincing yourself that you can. This is why I told you a week-plus ago to get an IT guy or prepare to lose your data at least once. I don't say that lightly. But if 10+ year IT veterans can make stupid mistakes, what does that say for a guy who doesn't want to be a sysadmin? In the last two weeks we've had three people lose 100% of their data after months of no problems. All of them work in IT.

I just want to build something around a 24-32TB zpool now. If I can't do what I ideally want, I'll just keep compromising on the least important aspects, even if the plan ends up being ten boxes this size, because I don't see how that changes anything.

And that just reiterates that you don't know enough to build a FreeNAS server safely. And the "compromising" begins. Because of your inexperience in IT, you'll compromise more than you want, until you realize too late that you compromised your data integrity with one stupid mistake, and you'll be telling yourself "but I didn't know better".

It's not the size of the data I care about, it's the integrity verification and the reduced sysadmining needed to get it. ZFS brought me here, but if there's another interim solution, not as perfect but better than Windows, I'm still all ears. It mostly needs to turn on once every few months, verify its data is intact with a scrub, and then be shut off again, to save the data for a future ten years from now, when I hope to have the budget to move everything onto a properly set up program and when I hope to find it hasn't lost a bit.

And do you really think that having multiple boxes is going to require less sysadmining than one large one? How are you going to do regular server maintenance (you do know what ALL the aspects of FreeNAS maintenance are... right?) on a server that is off more than it is on? The problem is that people don't look for the signs of a zpool degrading, nor do they even recognize them when they are flashing in their face. The next thing they know, they've lost three drives in a RAIDZ2, there is no fix, and their backups are either non-existent or not as pristine as they had hoped, despite testing them previously.
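To be concrete about what "looking for the signs" means, the routine checks are along these lines; the pool and device names are placeholders, and smartctl assumes the smartmontools package is installed:

Code:
zpool status -x          # prints "all pools are healthy" or details on any that aren't
zpool status -v tank     # per-device read/write/checksum error counters
smartctl -a /dev/da0     # SMART health, reallocated and pending sectors for one disk

None of it runs, of course, while the box is powered off.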

Being told I can't do it with ZFS just forces me to keep doing it through Windows 7, and to keep dealing with the extra sysadmin time required to make MD5s and RAR everything up into recoverable archives to stick onto hard drives, a clunky workaround to prevent regular data loss. It slows the rate I'm able to capture data, because right now I spend more time doing that than capturing. If what I'm hearing is that this is as good as it gets, I either have to get used to it or abandon my project to salvage things.

And time = money... as people have been saying since the beginning of time. :P

Years ago in the military I worked on a project somewhat similar to your situation. There was an easy solution that would have saved the countless hours every week I was forced to spend administering the project, but it had a VERY high barrier to entry because of the knowledge and experience required. The military wasn't going to pay for it because they found it "too expensive", yet had no problem letting me work absurdly long hours trying to administer it properly. I eventually realized I was wasting my time: if the military wasn't going to pay to do it right, why should I keep doing it wrong? We're not even talking big money here. This is why I previously said something like "your data is only worth what you're willing to pay to protect it".

Anyway, I'm done trying to respond to your questions. Not because I don't care, but because I really don't want to give the impression that this is even a reasonably smart decision on your part. Notice how few people are responding to your posts: many because they see another person show up who "has high hopes and low knowledge and cash" (and such people show up every few months with no clue what they are getting themselves into), and most of the rest because you are asking about system requirements that far exceed their level of knowledge to respond intelligently. It's easy to identify those with the skills to build a big system and those who are in over their head. If they knew what they were doing, they'd be posting the system specs of their current hardware for a crazy big system and asking a question, not asking a bunch of newbie questions.

I have yet to see a newbie with plans to build a big system actually not lose data while trying to figure out the mess that FreeBSD/FreeNAS can be. Part of the challenge is saying "I can do it", but at the same time people need to recognize when they are in over their head and back down before they lose data. It is entirely possible to build a system that appears to work reliably for weeks or months and then suddenly find you have no data because you didn't know the warning signs. I bet I could find ten people who have been in those shoes this year in less than 30 minutes.
 

Veti (Dabbler | Joined: Mar 11, 2013 | Messages: 17)
Well, in my opinion:

If you have little Unix experience, FreeNAS will be more difficult for you to administer than Windows.

My suggestion: since you are already planning on building a system, do that and test it with FreeNAS.
Test drive replacement, importing of pools, and whatever other scenarios you can think of (see the example commands below).
Test whether performance is what you expect.
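Those drills map onto a handful of zpool commands. A rough outline, run against a scratch pool you can afford to destroy (the pool and device names here are placeholders):

Code:
zpool scrub tank              # integrity pass over everything in the pool
zpool status -v tank          # watch for checksum errors or degraded vdevs
zpool offline tank da3        # simulate a failed disk
zpool replace tank da3 da8    # resilver onto a replacement drive
zpool export tank             # then pull the disks or reboot...
zpool import tank             # ...and confirm the pool comes back intact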

If the performance isn't what you expect, there are other options: the hardware will work with other operating systems if it works under FreeBSD.

For example, Windows 8 introduced Storage Spaces with ReFS (checksumming similar to ZFS), but with terrible write performance (something like 15MB/s).

I have one idea for you about keeping the price down. You said that once the data is written, the server goes offline; consider getting a JBOD enclosure like this:

http://unixsurplus.com/product/sgi-...d-bay-sas-sata-ssd-san-array-35-nas-jbod-good

That way you will only need one server, can attach 16 disks per unit to it, and at $195 per JBOD it's dirt cheap. Once a unit is full, disconnect it and attach the next JBOD to your system and fill it up.
 

Caesar (Contributor | Joined: Feb 22, 2013 | Messages: 114)
I don't know if this will help or not, as I don't fully understand what you are trying to do. I came across this article while researching my NAS box. It really has nothing to do with ZFS or FreeNAS at all, but maybe it will work for what you need.

http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

I believe they use ATA over Ethernet to access the drives. Also, Backblaze gives you details for building the hardware, but nothing about the software or configuration.

Good Luck
 