swap problems with 8.2.0p1

Status
Not open for further replies.

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19
Hello,

I was a happy user of 8.0.4 on an HP MicroServer N40L with 8GB RAM, 2 x 1GB hard disks (mirrored) and 1 x 3TB hard disk (not mirrored); the boot drive is a 32GB USB key, not that that's probably relevant.

I upgraded to 8.2.0p1 the weekend after it was released, but I have had problems ever since.

I upgraded using the boot CD method. After the upgrade all looked well. I installed the plugins jail really out of curiosity; I tried to install Firefly, which was answered by "an error occurred" in the GUI. I couldn't get past this and didn't really care, so I disabled plugins.

I have enabled autotune. I have AFP, CIFS, SMART monitoring and SSH on, everything else is off.

Under 8.0.4 I almost never saw any swap usage at all. Under 8.2.0 I don't need to be doing anything and several GB of swap can be in use. The big problem is that under modest load (rebuilding a 300GB file on the 3TB volume) the swap usage rises, and after 30 minutes to an hour the system slows down: web access stops, the file server is reported by the clients as having shut down, and eventually all I can do is reboot from the console.

The logs show things like

swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(12): failed
swap_pager_getswapspace(6): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(12): failed
swap_pager_getswapspace(9): failed
swap_pager_getswapspace(5): failed
swap_pager_getswapspace(16): failed
pid 15623 (smbd), uid 1001, was killed: out of swap space
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(8): failed

I don't think this has been reported elsewhere; I've looked!

I backed up my 8.0.4 settings, and I think rollback to 8.0.4 is as simple as reinstalling onto the stick and re-importing the settings, right? That is what I plan to do soon if I cannot solve the problem.

Does anybody have any ideas?

Thank you in advance.

Erik Carlson
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I backed up my 8.0.4 settings, and I think rollback to 8.0.4 is as simple as reinstalling onto the stick and re-importing the settings, right?

Correct.

What's the configuration of your drives? Are the 2 drives that are mirrored (and I'm thinking 1TB, not 1GB disks like you said above) storing the 300GB file that you are rebuilding? What do you mean when you say "rebuilding"? Have you tried doing a scrub of your zpool? When you set up your pool, did you disable the swap space?
 

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19

Thanks for the reply.

You are right I meant 1TB disks not 1GB. Apologies.

The 300GB file in the example was a hard drive image for an unrelated machine, which I originally stored uncompressed on the 3TB non-mirrored volume. By rebuilding I mean that I was converting the image (using software on a client computer) to a compressed image, which was being written back to the same 3TB volume across the network. This was the only real network and NAS traffic at the time.

I have not done a manual scrub; however, I do have scrubbing scheduled, and I believe a scheduled scrub has taken place since the upgrade. I read the emails from the NAS daily, which report both the 1TB pair and the 3TB as healthy, although the getswapspace failed errors do appear in the dmesg part of the security report.

No, I did not disable swap when I installed. I went with the defaults, and it worked fine under 8.0.4 for similar operations (though not exactly the same file as in the example above).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Have you previously done the same tasks with other files of roughly the same size? Have you tried disabling the autotune?

None of my machines have ever used swap space before. But I believe that swap space is used when the CPU can't process the data fast enough and needs a temporary location to store data to be written to the zpool later. If this is correct, my guess is the CPU isn't powerful enough to handle the whole process with the 300GB file and you run out of swap space. Presumably, once the swap space is full as well as the RAM, the system slows to a crawl because it's being starved of resources. My guess would be that if you let it finish everything instead of rebooting, it would actually come back.

Perhaps someone else with more experience with swap can speak up. I've always understood that if you are regularly using swap space then you need a more powerful CPU.

I googled your system. You have unfortunately maxed out your RAM (normally adding more RAM is the easy fix). I really can't vouch for the CPU (not an AMD person), so I don't know how that CPU compares to others or whether it is powerful enough to get the job done.

I suppose it's possible one of the drives is failing and it's just a coincidence that this started after you upgraded to 8.2. Hopefully paleoN will pipe up with some good ideas or the answer.
 

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19

Hi there,

Following up on this: I disabled autotune and deleted the tunables it created, and thought my problems had gone away. I had almost two days of normal service. However, the problem is back.

I am currently seeing 5GB of swap usage, with no corresponding interface activity at all (less than 3Mbps). Looking at the graphs, the swap usage has been creeping up over the last few hours. I should add that before 8.2.0 I was not using any swap space (not that I noticed, anyway), nor was I even remotely taxing the CPU.

I just don't understand. If nobody has any more suggestions I will just go back to 8.0.4.

Thanks
Erik
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315

If swap is used, one needs more memory. The CPU is not the problem; memory is. Cure it either by consuming less (fewer and smaller processes) or by expanding memory.
 

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19
If swap is used one needs more memory. CPU is not the problem, memory is. Cure either by consuming less ( = fewer and smaller processes) or by expanding memory.
Under normal circumstances I would agree; however, I was not doing anything with the NAS that I was not already doing under 8.0.4, and as I have already said, the NAS will use swap while it's doing nothing on the LAN, which suggests a bug, a memory leak or something similar to me.

To prove this I went back to 8.0.4 almost immediately after my last post. I have used absolutely no swap at all, and very little physical memory, despite having the same usage pattern and testing with the same large files that exacerbated the problem under 8.2.0.

I would like to go back to 8.2.0 but can't while it still has this bug, so I think I will have to stay with 8.0.4. I'm also worried about 8.3.0 as there seems to be no downgrade option due to the new ZFS format.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Erik,

8.2 has a lot of new features, bug fixes, etc. You really need to decide what you want to do (which it seems like you have). I would say it's pretty typical for each new version of FreeNAS to need more RAM as more features and such are added (how much more will depend on a lot of things). In your case your decision to stay on 8.0.4 makes sense, but you pretty much WILL have to upgrade your RAM or identify the exact bug causing your problem. The rule of thumb for ZFS is 6GB of RAM plus 1GB of RAM for each TB of hard drive space. In your case, the rule of thumb per the manual would recommend 12GB (50% more than what you have). Not the optimal situation to be in. Keep in mind that if you stay on 8.0.4 you are on your own as far as bugs and security vulnerabilities go.
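For concreteness, the rule of thumb above can be sketched as a tiny shell calculation. The capacity figure below is my own assumption (counting raw disk space); round up generously, which is how you land near the 12GB figure from the manual:

```shell
# ZFS RAM rule of thumb quoted above: 6GB base + 1GB per TB of storage.
recommend_ram_gb() {
    storage_tb=$1
    echo $((6 + storage_tb))
}

# Erik's box, counting raw capacity: 2 x 1TB (mirror) + 1 x 3TB = 5TB.
recommend_ram_gb 5   # prints 11; rounding up gets you to ~12GB
```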

If you read the 8.3 release notes the following comment is posted:
ZFS v28 includes deduplication, which can be enabled at the dataset level. The more data you write to a deduplicated volume, the more memory it requires, and there is no upper bound on this. When the system starts storing the dedup tables on disk because they no longer fit in RAM, performance craters. There is no way to un-dedup data once it is deduplicated; simply switching dedup off has NO EFFECT on the existing data. Furthermore, importing an unclean pool can require between 3 and 5GB of RAM per TB of deduped data, and if the system doesn't have the needed RAM it will panic, with the only solution being adding more RAM or recreating the pool. Think carefully before enabling dedup! Then, after thinking about it, use compression instead.

In essence, if you had a pool with dedup, everything may run great until you have an unclean pool. If you don't have enough RAM you'll never be able to get your pool out of the unclean state, and the ONLY fix is more RAM. These are issues admins of ZFS pools will have to consider and deal with in the future. I'm sure a few people (especially new users) will get bitten by this. Too many people jump in feet first and make critical errors that they pay for dearly later. I do realize this isn't your situation; it's just an example of how RAM can be fine today and not fine tomorrow. :P

Your issue is very unusual. To be honest, I'm not sure why you'd choose to use ZFS and then have a 3TB drive that isn't mirrored. Seems counter-intuitive. The whole point of ZFS is high reliability, but you have none if you aren't running a mirror or RAIDZ. There may be some kind of issue with the current version of ZFS when not in a RAID-type configuration. /shrug

Please keep in mind that when 8.3 comes out you will have to upgrade your zpool manually to support v28. When 8.3 is released (or sooner, if you have no problem testing the betas), I would recommend you install 8.3 and see if the situation changes. If it runs well for a long enough period to make you comfortable that it will not have a problem, you can upgrade the zpool using the 'zpool upgrade' command. At that point you will not be able to revert to any previous FreeNAS version and still access the zpool(s). Upgrading the zpool is completely optional.
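If it helps, the commands look roughly like this (the pool name "tank" is just a placeholder; check the zpool man page on your version before running anything):

```shell
# Show the on-disk ZFS version of a pool (hypothetical pool name "tank")
zpool get version tank

# With no arguments, list pools not at the latest version the system supports
zpool upgrade

# One-way step: after this, older FreeNAS versions cannot import the pool
zpool upgrade tank
```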

Personally, I'm not sure what my stance is on upgrading a zpool to a newer version. Reading other forums, I've found a lot of people have the opinion that you shouldn't upgrade a zpool unless you intend to use a feature that the new ZFS version supports. This thinking is about backwards compatibility. Until I started reading this everywhere, I'd always thought that newer is better and that I'd upgrade as soon as I verified 8.3 was reliable on my setup.

Just trying to clarify your situation... sorry for the long post. Hopefully it helped. :)
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Under normal circumstances I would agree, however I was not doing anything different with the NAS that I was not doing under 8.0.4, and as I have already said the NAS would use swap while it's doing nothing on the LAN which suggests a bug, memory leak or something to me.
Autotune's values under 8.2 are different from the values under 8.0.4. They lowered the amount given to ZFS to leave some memory for the jail. If you aren't using the jail you can safely increase the memory given to ZFS.
 

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19

Hi Noob Sauce,

Thanks for taking the time to write such a long answer. With the cost of RAM I absolutely would throw more RAM at the problem if I could, but as you observed in an earlier post, 8GB is my limit unless I go for completely new hardware.

TBH I am completely happy with what 8.0.4 does for me in terms of function. Prior to my current FreeNAS box we had a Buffalo Linkstation Duo Pro, which was awful and is blown away by FreeNAS. However, I agree that I should do something to resolve the bugs, as I will have to upgrade at some point since 8.0.4 will presumably not be maintained.

My problem is that I don't know how to troubleshoot these bugs at the moment. 8.0.4 worked so well that I have not delved into the documentation as deeply as perhaps I should have. You have convinced me that I need to do a lot more reading before going anywhere near dedup.

You are right that I don't need ZFS for the 3TB volume. My mirrored 1TB volumes are for data that I care about; the 3TB drive was just something I had lying around, and I only use it for temporary storage, for instance throwing around large drive images. But again, it worked fine under 8.0.4. I agree that it's counter-intuitive. I'll have to read up on my non-ZFS options.

I take on board your comments about 8.3. I didn't actually appreciate that you could use 8.3 without upgrading the pool. I have much more confidence now to try 8.3, knowing I can go back to 8.0.4 if I need to, as long as I don't upgrade the pools.

I will have to do more reading and try to get to the bottom of these bugs. So far I've tried umpteen combinations of settings and looked at the process table/top for anything that looks like it's hogging resources, but no joy.

Thanks for your help.
 

Erik Carlson

Dabbler
Joined
Aug 2, 2012
Messages
19
Autuned values under 8.2 are different than the values under 8.0.4. They lowered the amount given to ZFS to have some memory for the jail. If you aren't using the jail you can safely increase the memory to ZFS.

Thanks paleoN,
I did disable autotune and disabled and removed all the tunables/sysctls it added; I also removed the jail, disabled that service and rebooted, but the problem continued. Is there something else I should have done to increase the memory given to ZFS?
Erik
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I did disable autotune and disabled and removed all the tunables/sysctls it added; I also removed the jail, disabled that service and rebooted, but the problem continued. Is there something else I should have done to increase the memory given to ZFS?
Increase the values of the tunables? These 3 in particular: vm.kmem_size, vm.kmem_size_max and vfs.zfs.arc_max. Run the following to see what they are currently set to:
Code:
sysctl vm.kmem_size_max

sysctl vm.kmem_size

sysctl vfs.zfs.arc_max
Note vfs.zfs.arc_max needs to be smaller than vm.kmem_size, and vm.kmem_size needs to be smaller than vm.kmem_size_max.
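As a purely illustrative sketch for an 8GB machine (these numbers are my assumption, not a tested recommendation; set them via System -> Tunables so they end up in /boot/loader.conf and take effect on reboot), something like:

```shell
# /boot/loader.conf -- illustrative values only, for 8GB of RAM
vm.kmem_size_max="12G"
vm.kmem_size="11G"      # must be smaller than vm.kmem_size_max
vfs.zfs.arc_max="5G"    # must be smaller than vm.kmem_size; leaves room for services
```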
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
So, reading all the way through this, I would suggest enabling autotune and then increasing vm.kmem_size and vm.kmem_size_max.

ZFS itself won't swap, but if there isn't enough kmem left it might cause other things to. Maybe you will get lucky and whatever you were doing was simply exhausting kmem. Otherwise, as peterh alluded to, you need to figure out why you are swapping.
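To track down what is swapping, the usual FreeBSD tools look roughly like this (from memory; flags may vary by release, so check the man pages):

```shell
# Current swap devices and how much of each is in use
swapinfo -k

# One batch-mode snapshot of top, sorted by resident memory size
top -b -o res

# Processes ranked by resident set size (RSS, in KB)
ps -axo pid,rss,vsz,comm | sort -k2 -nr | head
```

Run these while the swap graph is climbing and compare snapshots over time; a process whose RSS keeps growing with no workload is your leak candidate.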
 