Understanding ZFS behaviour regarding disk writes and RAM usage!

Bytales

Dabbler
Joined
Dec 17, 2018
Messages
31
I know this isn't technically a FreeNAS question, but it relates to ZFS.
I'm currently running Linux Mint 19.1 "Tara", but I packaged a custom Ubuntu kernel with the 5.1.9 kernel and ZFS 0.8.1, which I installed.
Thus I have the latest ZFS under the latest Linux kernel.
The system consists of a 32-core EPYC CPU and 192GB of RAM.
Here's the deal.
I created a dataset with LZ4 compression, no dedup, and a 512KB record size. Onto this dataset I copied 140GB of data from an NVMe SSD, 8 times over. The data consists of almost 73k files, each approximately 2MB in size, so that's almost 600k files.
The zpool consists of a single WD Gold 10TB drive.
During the copy, the system managed an almost consistent 175MB per second from the SSD to the dataset. 140GB times 8 is a bit over 1 terabyte. Each 140GB batch took between 10 and 12 minutes.
During this copy, I watched RAM usage climb by a total of 44GB. (That's the amount of RAM which remained occupied after the copy was done and I had killed all RAM-consuming apps.)
Even though the copying ran at 170MB per second, I barely heard the disk drive spinning, which is very strange. Moreover, after I finished copying the whole lot, the RAM remained occupied. The only way to "flush" it was to restart Linux.

Alas, after the restart I had to import the zpool again and mount the datasets again. (I still need to figure out how to make them mount at startup automagically; a sketch follows below.)
Now the RAM was free.
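
For what it's worth, a minimal sketch of boot-time mounting, assuming the standard ZoL systemd units that Ubuntu-based packages ship (unit names can vary by distro, and "yourpool" is a placeholder):

    # enable pool import and dataset mounting at boot
    sudo systemctl enable zfs-import-cache.service zfs-mount.service zfs.target
    # record the pool in the cachefile that the import unit reads
    sudo zpool set cachefile=/etc/zfs/zpool.cache yourpool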

What gives? If I had waited longer, would the RAM have been cleared? How much RAM does ZFS really need? If I had 2TB of RAM, would that have been occupied in its entirety as well?
Is there a way to manually flush the occupied RAM after such an intensive write session? If the dataset is being constantly written to, should I expect the whole remaining RAM to get occupied? I need between 130 and 160GB of RAM for my workload, and I don't really have that much RAM to spare. If it is indeed necessary, I could add more RAM; my motherboard has 16 slots and only 8 are occupied.
But hell, I've seen the whole RAM load up. I've never had such a RAM-intensive load anywhere before.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
ZFS will use *all* the RAM it can. For writes, it will use a large chunk to create transaction groups that commit to disk, generally writing contiguous (sequential) ranges where possible. If your pool is nearly empty, you shouldn't hear much seeking (seeking on writes is a sign of fragmentation).

For reads, ZFS will hold basically any data that it's read off of disk in the ARC, until the data is freed on disk (in which case it is freed in ARC) or until the ARC is sufficiently crowded that it is evicted from ARC in favor of something more useful to cache. That's the "AR" in ARC. If the system itself is under memory pressure, ZFS will release portions of the ARC so that the system isn't memory starved.

You don't flush this. It self-manages.
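
If you want to watch it self-manage rather than guess from the OS free-memory figure, a minimal sketch for ZoL, which exposes ARC counters under /proc:

    # current ARC size, target size, and ceiling, in bytes
    awk '/^size|^c |^c_max/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats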

ZFS on FreeNAS typically requires a base 8GB plus an additional 1GB per TB of disk space to get "decent" performance. This softens a bit as the number of TB managed gets out past maybe 20 or so. But some more demanding workloads will require substantially more RAM. If you were running something like dedup, the RAM requirement is at least 5GB per TB of disk space - probably more.
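
As a worked example of that rule of thumb: the single 10TB pool described above would want roughly 8GB + 10 x 1GB = 18GB for decent performance, and at least 5GB x 10 = 50GB if dedup were enabled.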

If you have 2TB of RAM and a filled 2TB HDD, and you read all that 2TB in, ZFS will happily consume your entire 2TB of RAM with cache data and never need to perform a read from your HDD. This is the butter zone for ZFS reads, and is also a great situation for writes because the entire I/O capacity of the HDD is available for writes.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
Free RAM is wasted money. Be happy with RAM being fully used.
 

Bytales

Dabbler
Joined
Dec 17, 2018
Messages
31
Hot damn!
Is there a way to specify how much RAM it is allowed to draw? Sometimes it draws so much that there isn't any left for my needs.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
Hot damn!
Is there a way to specify how much RAM it is allowed to draw? Sometimes it draws so much that there isn't any left for my needs.
By default ZoL will use half the total system RAM I believe; do you actually need more than 96GB for non-ZFS workloads on this machine? If you do, that's fine; just don't fall into the trap of premature optimization.

You can adjust this amount by looking for the "maximum ARC" tunable.
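
On ZoL that tunable is the zfs_arc_max module parameter. A minimal sketch, where the 32GiB value (34359738368 bytes) is purely an example:

    # runtime change; takes effect immediately, though the ARC shrinks lazily
    echo 34359738368 | sudo tee /sys/module/zfs/parameters/zfs_arc_max
    # persist the limit across reboots
    echo 'options zfs zfs_arc_max=34359738368' | sudo tee /etc/modprobe.d/zfs.conf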
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
there isn't any left for my needs.
How much memory do you really need and what do you need it for? ZFS is not a lightweight file system. Why are you creating a storage pool on a single 10TB drive? That is no good for redundancy, and it is not good for performance either. What is this system for?
I'm currently running Linux Mint 19.1 "Tara", but I packaged a custom Ubuntu kernel with the 5.1.9 kernel and ZFS 0.8.1, which I installed.
Why did you use Linux?
 

Bytales

Dabbler
Joined
Dec 17, 2018
Messages
31
By default ZoL will use half the total system RAM I believe; do you actually need more than 96GB for non-ZFS workloads on this machine? If you do, that's fine; just don't fall into the trap of premature optimization.

You can adjust this amount by looking for the "maximum ARC" tunable.
That's what I was looking for. I need to somehow be able to specify how much RAM ZFS is allowed to use. If that can be done, then I might keep it.

How much memory do you really need and what do you need it for? ZFS is not a light weight file system. Why are you creating a storage pool on a single 10TB drive? That is no good for redundancy and it is not good for performance either. What is this system for?

Why did you use Linux?
I initially started using FreeNAS as a VM under ESXi because I ran 2 Windows VMs under ESXi, and I simply needed a way to access a single HDD from both Windows VMs at the same time. That's why I used FreeNAS: it allowed me to set up a share which could be accessed from both Windows VMs.

Now I've switched to Linux, because the Ubuntu VM I ran under ESXi didn't have native access to the hardware. The Linux Mint I am running now is installed as the main OS, thus having full access to the whole computer's hardware. I built ZFS here simply to be able to access my old zpool created in FreeNAS. However, I've learned that performance is lacking under ZFS (it eats up too much RAM, not leaving enough for the program I'm running, and disk write speeds are also low enough to be useless).

Now I've deleted the zpool and formatted the 10TB drive as an ext4 filesystem, and it seems I'm getting higher write speeds like this. I've yet to test my program to see if it keeps up with the write capabilities of the ext4 filesystem.

Probably ZFS would be useful for my load if I added 2 or 3 more 10TB WD drives similar to the one I have and created a pool from all of them (thus getting increased write speed).

Maybe a single drive with the ext4 filesystem on it is the way to go.

How do I "limit" the maximum amount of RAM ZFS can use?
 

Bytales

Dabbler
Joined
Dec 17, 2018
Messages
31
I learned, however, that what I need is "write power" and that the ARC basically covers "read power".
As such, I can increase "write power" by adding multiple HDDs to a zpool, which I would do using ZFS, or by simply buying an SSD.

I think no matter what I do, I'm limited by the write capabilities of the single HDD I have.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
Maybe a single drive with the ext4 filesystem on it is the way to go.
If you don't care about the integrity of your data, by all means, continue.

I can increase "write power" by adding multiple HDDs to a zpool, which I would do using ZFS, or by simply buying an SSD.
In a striped ZFS pool (no redundancy), each drive will increase the potential write performance (I/O) capacity of the pool. Again, if you don't care about your data, that's a valid option.
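
As a rough sketch, with device names as placeholders (and, to repeat, no redundancy: losing any one disk loses the pool):

    # three-drive stripe: writes are spread across all three disks
    sudo zpool create tank /dev/sdb /dev/sdc /dev/sdd
    # or grow an existing single-drive pool into a stripe
    sudo zpool add tank /dev/sdc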
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
I learned, however, that what I need is "write power" and that the ARC basically covers "read power".
Correct, but you were concerned with controlling the amount of RAM used by ZFS, which is mostly ARC... performance is another question and will be a tradeoff with complexity, redundancy, cost and the nature of your data.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
That's what I was looking for. I need to somehow be able to specify how much RAM ZFS is allowed to use. If that can be done, then I might keep it.

Why do you feel you need to specify this? As I previously said, it self-manages this and will adapt to whatever is available.

You're basically trying to tune a high performance engine when you don't really even understand how the engine works. This has the potential to end badly.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
I need to somehow be able to specify how much RAM ZFS is allowed to use. If that can be done, then I might keep it.
If you are still running in ESXi, you could just limit the amount of RAM given to the FreeNAS VM; it is one of the parameters you have to set up. That is the simple solution; however, realize that you could be trading off performance. You have a lot of folks telling you to let the system run. The RAM is used however FreeNAS/FreeBSD deems best. This means that if RAM must be freed up to run something else within FreeNAS, then it will do so; otherwise it will use the RAM you have given it to make your system more responsive.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
I need to somehow be able to specify how much RAM ZFS is allowed to use. If that can be done, then I might keep it.
As mentioned by several other users, ZFS will self-manage the amount of RAM it is consuming. If you have another workload that is demanding memory, it should release the memory it's claimed for ARC so that the other processes can use it. "Free RAM is wasted RAM" is something else that was written here, and it's correct.

See how the system responds under the actual shared memory workload, and ensure that the ZFS process releases memory from ARC.
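
A quick way to watch that, assuming the arcstat utility that ships with the ZoL tools is installed:

    # print ARC size and hit statistics every 5 seconds while the workload runs
    arcstat 5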

I think no matter what I do, I'm limited by the write capabilities of the single HDD I have.
Also correct. You'll be able to absorb a small burst of data if you're writing asynchronously, but eventually that data needs to get spooled out to stable storage, and with only a single disk you'll be waiting on that. More disks or faster disks is the answer.
 

ChrisReeve

Explorer
Joined
Feb 21, 2019
Messages
91
ZFS on FreeNAS typically requires a base 8GB plus an additional 1GB per TB of disk space to get "decent" performance. This softens a bit as the number of TB managed gets out past maybe 20 or so. But some more demanding workloads will require substantially more RAM. If you were running something like dedup, the RAM requirement is at least 5GB per TB of disk space - probably more.
Do you have documentation for this (RAM requirements with deduplication OFF), and what do you define as "decent" performance? With deduplication, yes, but with it turned off I believe that any RAM amount over some minimum size (which I won't try to guess) will give you decent performance, even if you have significantly less than the assumed 1GB of RAM per TB.

Again, for my pool: I have 10x10TB WD100EMAZ drives in a single RAIDz2. Dedup off, encryption and compression on. I initially ran with 64GB RAM (DDR3 ECC, 1600MHz). This is the setup I used when testing SMB performance, and I saw sustained writes passing 600MB/s over hundreds of GB. I upgraded to 128GB RAM, and saw no significant increase in performance.

I then went over to a virtualized setup with ESXi 6.7 as the hypervisor (which was the reason for the RAM upgrade). Initially I dedicated 96GB RAM to FreeNAS, and 4 physical cores (out of 16; I run 2x E5-2650 v2). Again, identical performance. I then installed an Intel DC P3700 as both SLOG (20GB) and L2ARC (256GB), with the rest left for overprovisioning. Running an internal storage network and passing an iSCSI drive back to ESXi, I made a datastore and installed VMs on it. This gave me almost 1.5GB/s of throughput (testing with CrystalDiskMark on a Windows 10 VM running off the iSCSI).
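
For anyone wanting to replicate that layout from the command line, it is roughly the following, with the pool and partition names as placeholders:

    # 20GB partition as a separate log device (SLOG), 256GB partition as L2ARC
    sudo zpool add tank log /dev/nvme0n1p1
    sudo zpool add tank cache /dev/nvme0n1p2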

I'm not claiming to have a better understanding of ZFS than you, far from it; I have used many of your posts to gain a better understanding of how it works. But I still believe that the minimum RAM requirements/recommendations of 1GB per TB are somewhat overstated.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
still believe that the minimum RAM requirements/recommendations of 1GB per TB are somewhat overstated.
You're welcome to do the testing and write it up... this is a volunteer support forum and the reason documents like that don't exist is that people don't have the time to do the research and write them.

If you search around a bit, you can find examples of people having done some performance testing and sharing the results, but few of those will have investigated "low RAM" configurations (since most people just understand that skimping on RAM isn't good for performance without the need for minute detail; hence the rule-of-thumb approach shared by the forum).
 

ChrisReeve

Explorer
Joined
Feb 21, 2019
Messages
91
You're welcome to do the testing and write it up... this is a volunteer support forum and the reason documents like that don't exist is that people don't have the time to do the research and write them.

If you search around a bit, you can find examples of people having done some performance testing and sharing the results, but few of those will have investigated "low RAM" configurations (since most people just understand that skimping on RAM isn't good for performance without the need for minute detail; hence the rule-of-thumb approach shared by the forum).
That's the thing: I have done some testing, but will continue. Also, all the recent testing I have seen (from the last few years, as opposed to 6-7 years ago) shows that, if you do not use deduplication, you can get away with far less RAM than the stated rule once you go above 32GB of RAM. Hence the updated guidelines. If you claim that you will see a significant loss of performance going below 1GB of RAM per TB of storage, you are wrong. The answer is that it depends. Anything over 32GB will be sufficient for at least 80-100TB. Again, for most use cases.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
That's the thing: I have done some testing, but will continue. Also, all the recent testing I have seen (from the last few years, as opposed to 6-7 years ago) shows that, if you do not use deduplication, you can get away with far less RAM than the stated rule once you go above 32GB of RAM. Hence the updated guidelines. If you claim that you will see a significant loss of performance going below 1GB of RAM per TB of storage, you are wrong. The answer is that it depends. Anything over 32GB will be sufficient for at least 80-100TB. Again, for most use cases.

This is highly dependent on many factors, and definitely not safe "for most use cases." It will be fine for archival purposes, probably fine for single-user or media storage. For a busy departmental fileserver, it depends on usage patterns, but it would probably bite the big one.

People like to starve ZFS of memory and say "oh it's fine" because they can still access files. If that's your measure of "fine", then, fine, put 100TB on an 8GB NAS and ... well maybe it will be fine.

ZFS needs to be able to cache metadata. The exact amount of RAM that's needed is not really easy to define, because a NAS that has a thousand snapshots and lots of little files has intensely greater metadata needs than a NAS that has no snapshots and is storing large files. If you're doing lots of writing, ZFS needs metaslab and other metadata easily available, and this can require gobs and gobs of memory.

Writing generalized guidelines for this is very difficult, because there will always be someone who's like "Oh, I ran my 100TB on 8GB and it was great!" followed immediately by someone who says "Oh, my 100TB system was so slow on 32GB that I had to replace the mainboard for one that could do 64GB; why didn't anyone warn me?"

The typical ZFS guidance is to start at 1GB per TB. This is a level where most fileservers will be JUST FINE unless they're doing something like dedupe. Over-resourcing ZFS generally doesn't result in operational issues, whereas under-resourcing does.
 

Big Data Guy

Dabbler
Joined
May 14, 2020
Messages
15
The typical ZFS guidance is to start at 1GB per TB. This is a level where most fileservers will be JUST FINE unless they're doing something like dedupe. Over-resourcing ZFS generally doesn't result in operational issues, whereas under-resourcing does.

jgreco, how much RAM would be too much, then, if that's the case?
My NAS currently has 6x4TB and it has maxed out my poor 16GB system. If I upgrade to 6x8TB in the near future, should I get 64GB of RAM, or is 32GB enough? I do run 2 VMs and a Plex server on it.

The dashboard says the memory is used as follows:
Free: 0.4 GiB
ZFS Cache: 9.5 GiB
Services: 5.9 GiB
Does that mean I have 9.5GB left for VMs or whatnot?

Does the 1GB per 1TB rule apply to occupied space only, or to the whole disk size?


Thanks
 