Is 32GB of ECC RAM sufficient for 32TB of storage using ZFS?

Status
Not open for further replies.

billhickok

Dabbler
Joined
Oct 8, 2014
Messages
36
The best use case for ZFS with 4TB+ HDDs is archival large-file storage, where you are mostly laying down large items and letting them sit there for years. Small-file storage is more difficult for spinning media because of the seek overhead. At some point you lose the ability to store and retrieve the files before the drive's expected service life is reached... do the math: if you are storing 4KB files on an 8TB HDD, assuming you can write 100 of them per second (limited by seek speed) and do nothing else, it'd take roughly 250 days to fill the disk. I don't care to discuss the fact that it is a contrived example and not realistic from some points of view - it demonstrates the problem spinning rust faces when storing small files over the long term. Also, the ZFS scrub mechanism is a metadata traversal, so it tends to suffer on pools storing lots of smaller files.
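To put rough numbers on that example (a back-of-the-envelope sketch only; the 100-writes-per-second rate is the assumption from the quote above, not a measurement):

[CODE]
# Back-of-the-envelope: time to fill an 8TB drive with 4KB files at a
# seek-limited ~100 small writes per second, doing nothing else.
DRIVE_BYTES = 8 * 10**12        # 8 TB, decimal as drive vendors count it
FILE_BYTES = 4 * 1024           # 4 KB per file
WRITES_PER_SEC = 100            # assumed seek-limited write rate

files = DRIVE_BYTES // FILE_BYTES
days = files / WRITES_PER_SEC / 86400
print(f"{files:,} files, ~{days:.0f} days to fill the drive")
# -> 1,953,125,000 files, ~226 days: the same ballpark as the ~250 days above
[/CODE]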


I just ran into this myself. I had two servers running an E3-1230v2 with 32GB of ECC RAM and 12x3TB HDDs, and performance tanked. I did some testing and determined I needed more RAM but was capped by the motherboard. So I had to upgrade; I went with an E5-1620v2 with 64GB of RAM and things are faster than they have ever been. The board will support 512GB of RAM but I doubt I'll put more than 128GB in it. It cost me quite a bit but it was still cheaper than buying something off the shelf.

Sorry, I should point out that these servers aren't for my home use; they are deployed in a business. My home server is set up with the 1GB of RAM per 1TB of storage rule and has operated without issue for years now.

This makes me a bit reluctant about my plan to do 10 x 4TB in RAIDZ2. I do have enough data to eventually fill up most of that space, and the majority of my data consists of small files (anywhere from 5-30MB each). Though I likely won't be constantly utilizing (streaming or playing) most of the small files; they'll mostly just sit as an archive. I already have another NAS (Synology) which houses most of my big files.

I do plan on doing weekly transfers of around a thousand small files to the ZFS setup. Again, most of the time I'll probably be the only user of this server, as it's in a home environment. I really don't want to risk running into a brick wall one day and being forced to upgrade my hardware, however. I'm planning on picking up all the hardware in the next few weeks and, for me, it's gonna be a pretty significant investment in terms of money. What do you guys think... is there a decent chance I'll run into issues with 32GB of RAM for 10 x 4TB in one vdev?
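For reference, here's a quick sketch of the numbers behind that question, using the 1GB-of-RAM-per-1TB rule of thumb mentioned above (RAIDZ2 padding and metadata overhead are ignored, so the usable figure is optimistic, and the rule is only a rough guideline, not a hard limit):

[CODE]
# Sizing sketch for the proposed layout: 10 x 4TB in a single RAIDZ2 vdev,
# checked against the rough "1GB of RAM per 1TB of storage" rule of thumb.
disks, disk_tb = 10, 4
parity = 2                                # RAIDZ2 uses two disks' worth of parity

raw_tb = disks * disk_tb                  # 40 TB raw
usable_tb = (disks - parity) * disk_tb    # ~32 TB before padding/metadata overhead

ram_gb = 32
rule_of_thumb_gb = usable_tb              # rough guideline only, not a hard limit

print(f"raw {raw_tb} TB, usable ~{usable_tb} TB, "
      f"rule of thumb ~{rule_of_thumb_gb} GB RAM vs {ram_gb} GB installed")
[/CODE]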
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Mlovelace is running a 10GbE NIC and serving something more challenging than media and a few thousand files a week. :) I don't see specifics anywhere. But this workload is not similar to yours. It only demonstrates that the issue is complex, and that performance can tank suddenly.

Seriously, you aren't stretching very far at all with 10x4TB for one user. Cyber is at 60TB for one user in one vdev without issue. But is he right on the edge? Is it dependent on how full or fragmented his pool is? Will he hit THE WALL tomorrow? The thing is, there is no way for us to define 'decent chance'.

I'd be completely comfortable with any media/archival-style setup up through 12x6TB in two 6-disk vdevs. But understand that I wouldn't lose sleep for even one second if the performance tanked and I needed to jump to an E5. The disks/RAM are where the money is, and you can keep those. I'd test up to 24x6TB just to see what happens, with the understanding that I'd likely have to jump platforms. But I wouldn't be surprised in the slightest if it worked nicely while the pool has space, or if the issue was tunable via parameters, L2ARC, etc. We are way out past the limits of conventional use and experience at that point. The disks didn't exist to do it previously, there was no use case, and it was cost prohibitive.

By the same token, the 32GB limit ticks me off every single day when I want to play hard with ESXi-based loads and VMs. Mostly because the E3s have plenty of processing power, and the limit feels like an oversight or glitch when even an Avoton can do 64GB. So I can completely identify with... should have bought the E5 on round one. I also won't even pretend to be a "normal" user, but my little E3 server will likely see a very large pool before I am done with it.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
mjws00: "Mlovelace is running a 10GbE NIC and serving something more challenging than media and a few thousand files a week. :) I don't see specifics anywhere."


The specifics of those server workloads are: backup target for a VMware cluster, backup target for 6 SQL databases and their transaction logs, AFP shares, CIFS shares, an SFTP share, an NFS volume for vCenter ISOs (the VMs and SQL databases live on NetApp SANs), snapshot replication for offsite data security, and rsync target for Linux clients. I put our FreeNAS servers through their paces and they keep on trucking along, plus I can build these servers for about a third of the cost of a Dell/HP.

You should have no problem running a home environment with 32GB of RAM.
 

esamett

Patron
Joined
May 28, 2011
Messages
345
JGreco: "Moving fragmented data back and forth between pools is helpful in specific instances. One of the prerequisites would seem to be that you'd need a pool that had lots of free space, so that you weren't actually taking contiguous data out of one pool and shotgun spamming it around the other pool because of insufficient space."

My idea would be to copy the data to another pool, empty the donor pool's recycle bin, then copy the data back, hopefully now less fragmented. Would the above concern still hold?

Thanks,

p.s. how do you do those nifty quotations? o_O
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's really a function of how much free space there is. My opinion has gradually evolved to the point where I think it's bad to fill a production pool with lots of write activity beyond maybe 60-70% if you want the best performance possible, but there's actually no magic percentage. It is tied too closely to the use model and the existing state of the pool. However, if you do have a pool that is only 60% full, you are likely to be able to pick data files that you expect to be highly fragmented, copy them from the pool right back onto the pool, and this should result in substantially less fragmentation (for the file) unless you've somehow managed to generate pathological levels of fragmentation on the other allocated blocks in the pool. The point is that having more free space means more elbow room is available for administrative shuffling that doesn't need to involve another pool.
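As a concrete illustration of the copy-it-back-onto-the-same-pool idea (a minimal sketch, not a prescribed procedure; the file path is made up):

[CODE]
import os
import shutil

# Hypothetical path on the pool; pick a file you expect to be badly fragmented.
src = "/mnt/tank/archive/old_backup.img"
tmp = src + ".rewrite"

shutil.copy2(src, tmp)    # copy-on-write lays the new copy into free space
os.replace(tmp, src)      # atomically swap the rewritten copy over the original

# Note: any snapshots still reference the old blocks, so the old, fragmented
# allocation isn't actually freed until those snapshots expire.
[/CODE]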

But as I said, the percent full is largely arbitrary; investigations have been done that show pathological situations where performance drops off with as little as ~10% of the pool filled! Part of the answer to this is that you probably need to have some idea of what your workload actually does on disk if you're administering a ZFS pool, and a plan as to how to mitigate that.
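If you want to keep an eye on how full and how fragmented a pool is getting before it becomes a problem, something along these lines works (a sketch only; the pool name "tank" is hypothetical and the zpool binary is assumed to be on the PATH):

[CODE]
import subprocess

def pool_stats(pool="tank"):
    # zpool list -H gives tab-separated, script-friendly output for the
    # requested properties (capacity = percent full, fragmentation = FRAG).
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,capacity,fragmentation", pool],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    name, capacity, frag = out.split("\t")
    return {"pool": name, "capacity": capacity, "fragmentation": frag}

print(pool_stats())   # e.g. {'pool': 'tank', 'capacity': '62%', 'fragmentation': '11%'}
[/CODE]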

As for quoting, you can use the reply option underneath a post to have the software do the hard work, or you can manually surround the text with [QUOTE] and [/QUOTE] tags.
 