FreeNAS 9.10 STABLE keeps dropping iSCSI connection

Status
Not open for further replies.

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
8GB is the minimum recommended for a normal FreeNAS system. With you running it as a VM and using iSCSI, I think it is very much underpowered.

For all but the most trite uses, I'd not recommend less than two cores (with reservation) and 32GB of RAM for iSCSI. It'll "work" with less but it will tend to be flaky.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Hi,
I got a NOP-Out:

Checked the RRD graph:
  • There was some iSCSI I/O, peaking at around 192MB/s read and 114MB/s write.
  • ARC memory dropped from 6.2GB to 6.0GB at that moment, and the hit rate fell from 74.9% to around 70.9%.
My current ARC has shrunk to 5.9GB (minimum 5.8GB).
In the VMware event log, besides the connection lost near the NOP-Out time, there are also "I/O latency increased" events scattered across various occasions.
I think 9.10 has a memory-leak problem and the kernel iSCSI target still has serious issues. Normally my ARC never went below 6.0GB and no swapping happened. Below is the output from top (uptime just over one day):
Code:
last pid: 80687;  load averages:  0.39,  0.29,  0.27    up 1+19:18:31  13:22:58
36 processes:  1 running, 35 sleeping                                         
CPU:  1.0% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.0% idle        
Mem: 84M Active, 1061M Inact, 6712M Wired, 23M Cache, 47M Free                
ARC: 6099M Total, 2543M MFU, 3107M MRU, 450K Anon, 303M Header, 146M Other    
Swap: 8192M Total, 63M Used, 8128M Free
Have you tried increasing the RAM allotment for your FreeNAS VM? Perhaps increasing it to 16GB vs the 8GB you're using now would help. I'm using 16GB and was unable to reproduce iSCSI timeouts w/ 9.10-STABLE, as I described upthread.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
For all but the most trite uses, I'd not recommend less than two cores (with reservation) and 32GB of RAM for iSCSI. It'll "work" with less but it will tend to be flaky.
Have you tried increasing the RAM allotment for your FreeNAS VM? Perhaps increasing it to 16GB vs the 8GB you're using now would help. I'm using 16GB and was unable to reproduce iSCSI timeouts w/ 9.10-STABLE, as I described upthread.

I might increase the RAM allocation for the FreeNAS VM, but that is a temporary "hack" rather than finding the root cause of the issue.
I don't use dedup, and my I/O requirement is simple: two Windows Server 2012 VMs running small SQL databases. So I think 8GB for FreeNAS is more than enough (it was actually fine on 9.2, and with other NAS OSes like Nexenta, QuantaStor, etc.).
 

maglin

Patron
Joined
Jun 20, 2015
Messages
299
The iSCSI service requires RAM. The middleware requires RAM. ZFS requires RAM. 8GB is the minimum, and as things evolve they require more RAM; otherwise we would all still be using 2MB of RAM and 16-color graphics.
I would recommend you go back to a version of FreeNAS that you know works with what you are willing to provide it. As you said, 9.2 worked. Or switch to Nexenta. I'm basically just restating what you said.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
The iSCSI service requires RAM. The middleware requires RAM. ZFS requires RAM. 8GB is the minimum, and as things evolve they require more RAM; otherwise we would all still be using 2MB of RAM and 16-color graphics.
I would recommend you go back to a version of FreeNAS that you know works with what you are willing to provide it. As you said, 9.2 worked. Or switch to Nexenta. I'm basically just restating what you said.

Thank you for the information. I have used computers long enough to see and feel the evolution. Even nowadays, when DRAM is cheap, buggy code can (still) consume all of your available RAM, and for a server or embedded solution designed to run non-stop that is a serious problem. My VMware system still has plenty of free RAM that I could assign to FreeNAS, but I don't think that is the right way to help the developers eliminate the bug.
The reasons I chose FreeNAS instead of NAS4Free or others:
  1. FreeNAS is a free & open-source solution.
  2. The FreeNAS team wants to modernize FreeNAS.
  3. As I use the product, I can test it and help other people. It is a small contribution back to the community; otherwise I wouldn't waste my time reporting bugs from 9.3 through 9.10 like this. I'd just spin up a Nexenta or QuantaStor VM and, voila, done!
Actually, on 9.3, after some buggy updates, it ran fine until the very recent 9.10 update (the first few 9.10 updates were OK as well). Other people have hit the same problem I have, and their systems have far more resources than mine.
To be honest, I have read many topics on the FreeNAS forum, and some experienced members here tend to treat other people as dumb or inexperienced, without hands-on knowledge of server systems. That is really sad for a community. It also reminds me of a past experience with a high-level SCO executive: when I asked about Linux, he said it was trash! Everybody knows how that turned out. :)
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
I got another NOP-Out while the two VMs were doing database backups.
The ARC is at 5.4-5.5GB (down 500MB compared to yesterday).
Output from top:
Code:
last pid: 22610;  load averages:  0.05,  0.15,  0.20    up 2+20:40:12  14:44:39
34 processes:  1 running, 33 sleeping                                   
CPU:  2.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 97.7% idle  
Mem: 88M Active, 1461M Inact, 6219M Wired, 48M Cache, 111M Free         
ARC: 5616M Total, 1852M MFU, 3352M MRU, 290K Anon, 275M Header, 137M Other
Swap: 8192M Total, 60M Used, 8131M Free
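If you want to watch these numbers from the shell instead of the GUI, here is a quick sketch using the stock FreeBSD ZFS sysctls (note these counters are cumulative since boot, so this gives a lifetime hit rate rather than the instantaneous one on the RRD graph):
Code:
#!/bin/sh
# Sketch: read ARC size and lifetime hit rate from the standard arcstats sysctls
size=$(sysctl -n kstat.zfs.misc.arcstats.size)
hits=$(sysctl -n kstat.zfs.misc.arcstats.hits)
misses=$(sysctl -n kstat.zfs.misc.arcstats.misses)
echo "ARC size: $((size / 1048576)) MB"
echo "ARC hit rate: $(echo "scale=1; 100 * $hits / ($hits + $misses)" | bc)%"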


I found devd using a huge amount of RAM. Based on the man page (http://www.freebsd.org/cgi/man.cgi?devd(8)), I don't think devd should use anywhere near that much.
Code:
1154 root 1  20 0 1173M 1160M select 0 0:14 0.00% devd


I checked my first post; devd was consuming a huge amount of RAM at that moment too:
Code:
1143 root 1 20 0 3705M 3622M select 1 0:45 0.00% devd
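For anyone who wants to track this over time, a simple watcher along these lines should do (the log path and interval are just examples):
Code:
#!/bin/sh
# Sketch: append devd's virtual/resident memory size to a log every 10 minutes
while true; do
    date "+%Y-%m-%d %H:%M:%S" >> /tmp/devd-mem.log
    ps -o pid,vsz,rss,command -p "$(pgrep devd)" >> /tmp/devd-mem.log
    sleep 600
done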
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
My VMware system still has plenty of free RAM that I could assign to FreeNAS, but I don't think that is the right way to help the developers eliminate the bug.

abcslayer said:
Actually, on 9.3, after some buggy updates, it ran fine until the very recent 9.10 update (the first few 9.10 updates were OK as well). Other people have hit the same problem I have, and their systems have far more resources than mine.

We may have misunderstood you, thinking that you just wanted to get your system running smoothly. In that case, it makes sense to add more memory to your FreeNAS VM; after all, you're running the minimum recommended amount. Like you, I experienced iSCSI timeouts with previous versions of FreeNAS but am currently unable to reproduce these problems. But note that I'm using twice as much memory as you are. If, as you mentioned, others with more memory are experiencing timeouts, then the problem is one of those intermittent, hard-to-diagnose bugs that drive developers crazy. :smile:

As far as I know, the developers don't read this subforum, so if your actual goal is to help them improve FreeNAS, why don't you open a bug report? That would draw the attention of the development team and perhaps lead to a definitive solution.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
after all, you're running the minimum recommended amount.

No, he's not.

The minimum recommended for iSCSI *for any use* is 16GB. The minimum recommended for iSCSI where performance is a concern is 32GB.

Please see the manual. http://doc.freenas.org/9.10/freenas_intro.html#ram

Because he wants performance, the minimum required is 32GB. The OP is running 1/4th the recommended RAM. I've *never* been able to get good performance out of iSCSI for any real world scenario with less than 16GB.

Quite frankly this is not expected to work well. He's running the bare minimum memory that's expected to make a light duty general purpose fileserver work right. And he wants to do demanding, difficult work with it. This will not end well. And, as evidenced in this thread, has not ended well.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
Well, I filed the bug here: https://bugs.freenas.org/issues/16170
Actually, I built this system back on 9.2; at that time there was nothing in the requirements about a minimum for iSCSI, and the known memory hog was dedup, which I don't use. My pool is purely SSD-based (stripe+mirror), so I don't think a small ARC (6.0-6.5GB) could make the system perform even worse than a single SSD.
The reason I looked at FreeNAS and other NAS appliances is that I do not want to be locked in to any RAID controller.
Also, my system's iSCSI connection is virtual (all networking activity happens in VMware memory), so it should not be what limits performance. The only thing that keeps me on this forum and posting until now is that I believe the kernel iSCSI module has a problem, and 9.10 has some memory-leak issues too (I think devd).
If everything could be solved by popping in more DRAM, we wouldn't need good coders, right? We also wouldn't need to rely on BSD for I/O-heavy tasks; we could just keep using Windows Server. :)
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
No, he's not.
The minimum recommended for iSCSI *for any use* is 16GB. The minimum recommended for iSCSI where performance is a concern is 32GB.
Good point. Hadn't read the RAM section in over a year, but both the 9.3 and 9.10 docs state:
If you plan to use iSCSI, install at least 16GB of RAM, if performance is not critical, or at least 32GB of RAM if performance is a requirement.
So, @abcslayer, you really ought to, at a minimum, double your RAM to 16GB and see if the timeouts persist. If they do, double it again to 32GB. If you still have timeouts with 32GB of RAM, definitely file a bug report.

I doubt the developers will pay any attention to a bug report involving an iSCSI user with only 8GB of RAM.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I don't use dedup, and my I/O requirement is simple: two Windows Server 2012 VMs running small SQL databases. So I think 8GB for FreeNAS is more than enough (it was actually fine on 9.2, and with other NAS OSes like Nexenta, QuantaStor, etc.).

Yeah, we get that all the time. But really, the nature of block storage on any CoW filesystem tends to be very challenging for planning. Performance plummets with the age of the pool as fragmentation develops, and ZFS doesn't have the insight into what's being stored to be smart about it.

https://forums.freenas.org/index.ph...res-more-resources-for-the-same-result.28178/

So what tends to happen when you start out with a fresh pool on 8GB and iSCSI is that it'll seem very fast for a little while, until the pool starts to fill and fragment, at which point it all goes to fsck because the system now actually has to do a little bit of work, and has almost no resources to do it. This happens with other ZFS systems as well. This is a ZFS problem, not a FreeNAS problem.
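You can watch this happen on your own pool: the capacity and fragmentation columns of zpool list tell the story as the pool ages (substitute your pool's name, obviously):
Code:
# FRAG and CAP climb as the pool fills and fragments over time
zpool list -o name,size,allocated,free,fragmentation,capacity yourpool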

Actually, on 9.3, after some buggy updates, it ran fine until the very recent 9.10 update (the first few 9.10 updates were OK as well). Other people have hit the same problem I have, and their systems have far more resources than mine.
To be honest, I have read many topics on the FreeNAS forum, and some experienced members here tend to treat other people as dumb or inexperienced, without hands-on knowledge of server systems. That is really sad for a community. It also reminds me of a past experience with a high-level SCO executive: when I asked about Linux, he said it was trash! Everybody knows how that turned out. :)

The problem is that ZFS works differently than lots of other things, and your experience with other IT products doesn't cleanly translate. You "think" 8GB should be "more than enough" but it isn't.

@Nick2253 posted this appropriate summary:

One of the things that I see a lot here is that people try to take their experience from previous IT endeavors and apply it to building a FreeNAS machine. It's really, really important that you check this experience at the door.

ZFS is a crazy beast, which does things that are far outside the "normal" realm of what most server applications or client applications do. This is why it's really important to double check all assumptions, or else you're going to have a bad time. [...] ZFS's need for memory boggles the mind, especially in comparison for what a Windows File server would need. [...]

All this isn't just because ZFS is inefficient; far from it. It is because ZFS does everything it can to protect your data from corruption. Right now, there is no alternative on the market with the same level of redundancy and features as ZFS. But that does come with a huge performance hit, relative to what other solutions can provide.

Honestly, I've been here for five years and there's an amazing number of people who come here and assume their IT experience immediately qualifies them as FreeNAS masters, capable of deciding that the rules and recommendations are meant "for someone else." They're not. They're meant for you.

And your attitude:

some experienced members here tend to treat other people as dumb or inexperienced, without hands-on knowledge of server systems. That is really sad for a community.

That's really just not called for. We're trying to help you.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
Well, actually, before I built my system I read a lot of posts from you, @jgreco, and from cyberjock and other members.
Even the choice of stripe+mirror was a hard one (RAIDZ2 was the other option).
I will increase the RAM, but to be honest, that just makes the bug harder to spot. The logic is: if the cache is not big enough -> FreeNAS has to access the SSDs directly -> system performance should be very close to a single SSD, and at least there should be no latency problem (my pool is purely SSD). In fact, it is totally different. That's the point I want to make here.
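One way I can test that logic is to watch device-level latency on the filer itself while ESXi is complaining; gstat from the FreeBSD base system shows per-device latency live (adjust the device pattern to match your disks):
Code:
# ms/r and ms/w are per-device read/write latency; if the SSDs stay
# sub-millisecond while ESXi reports hundreds of ms, the delay is above the disks
gstat -f '^(ada|da)[0-9]+$'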
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, actually, before I built my system I read a lot of posts from you, @jgreco, and from cyberjock and other members.
Even the choice of stripe+mirror was a hard one (RAIDZ2 was the other option).
I will increase the RAM, but to be honest, that just makes the bug harder to spot. The logic is: if the cache is not big enough -> FreeNAS has to access the SSDs directly -> system performance should be very close to a single SSD, and at least there should be no latency problem (my pool is purely SSD). In fact, it is totally different. That's the point I want to make here.

Yeah, no, the performance isn't really likely to be "very close to 1 single SSD." I know why you'd hope for that but it's not really correct.

You start dealing with all sorts of side effects from the many layers in between: the consumer of the disk resource (a VM) passes the read to ESXi, which has to figure out where on the datastore it lives and then send a block read request over the network to the filer, which has to map that to a spot on the pool and perform the read, and then the whole thing goes back the other way.

There are numerous things that accelerate the process, but many of them involve the word "cache." Others involve luck or sequential nature of the data being accessed.

I don't know if there's a "bug" or not. Maybe there is. Try going back to 9.3 and see. It's very possible you just happened to update at around the same time fragmentation on your pool took a turn for the worse, and you suddenly noticed. If there's actually a bug, then that's also good to know.

However, regardless of whether or not there's a bug, I am telling you flat out that I've never seen a stable iSCSI system running significant traffic on an 8GB system. I'm sorry. That's not the way I'd wish it to be. It's an observation, based on being probably *the* guy who's looked at more iSCSI on FreeNAS than anyone else here on the forums.

The comments I make in this post are still spot-on today:

https://forums.freenas.org/index.php?threads/esxi-5-5-datastore.23165/#post-140043

because ZFS is still the big fat pig it was in 2014.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
Thank you for your inspiring comment.
By the way, the devd memory bug is confirmed.
Today my devd consumes:
Code:
 1154 root          1  20    0  1529M  1515M select  0   0:18   0.00% devd 

Seriously, I feel the FreeNAS team's QA testing has had problems since 9.3.

---------------------
A few more things to consider:
  1. My SSD pool is over-provisioned (I never use up the drives' space; I know the issues with long-running SSDs).
  2. Because it is purely flash, fragmentation should not be a huge penalty.
  3. My ARC hit rate is normally quite good, 60-75% (because the main I/O goes to our databases, which are quite small).
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Seriously, I feel the FreeNAS team's QA testing has had problems since 9.3.
I always thought of FreeNAS as akin to Fedora, where Fedora is the free, "bleeding edge" version of Red Hat.
While we get the product for free, we also take on a partial QA role and serve as the "bleeding edge" for TrueNAS.

Also, keep in mind that they are taking on two different branches (9.10 and 10.x).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thank you for your inspiring comment.
By the way, the devd memory bug is confirmed.
Today my devd consumes:
Code:
 1154 root          1  20    0  1529M  1515M select  0   0:18   0.00% devd 

Seriously, I feel the FreeNAS team's QA testing has had problems since 9.3.

FreeNAS didn't write devd. I don't know what's causing the issue, but it is entirely possible that the issue was introduced into FreeBSD. I wouldn't blame FreeNAS for FreeBSD bugs. (By the way, you realize that YOU and the rest of us are the QA testers on the FreeNAS team, yes?)

Obviously something's wrong there and I encourage you to file a bug report.

Because it is purely flash, fragmentation should not be a huge penalty.

You'd think. But flash is a funny thing. I've found things work better with ashift=13, among other issues.
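If you want to see what your pool is using, zdb will show it; on FreeNAS you have to point it at the cachefile the middleware keeps under /data (the pool name here is just an example):
Code:
# Read-only check of the vdevs' ashift
zdb -U /data/zfs/zpool.cache -C yourpool | grep ashift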
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
From @abcslayer's bug report:
Alexander Motin said:
We do not officially support FreeNAS running inside VM, and it is difficult to diagnose something based on provided information. Mentioned devd memory leak was indeed found and fixed in 9.10 recently. If as you write it consumed 3.5GB of memory, then it indeed can create performance problems, considering that your system is already at lower recommended mark on RAM size. It should be fixed in next nightly build and next release.
 

abcslayer

Dabbler
Joined
Dec 9, 2014
Messages
42
Well, he has since edited his comment in my bug report. From the moment I filed it until now there has been no newer update or patch, and the devd bug is still there. (His original comment said something about a nightly release and an incoming update.)
After 12 hours of running, my NAS, now with 16GB of RAM, started getting I/O deterioration events like this:
Device t10.FreeBSD_iSCSI_Disk______000c29e873c1000________________ performance has deteriorated. I/O latency increased from average value of 2149 microseconds to 299519 microseconds.
7/3/2016 3:03:02 AM

Devd memory today:
Code:
last pid: 80607;  load averages:  0.48,  0.39,  0.34    up 1+22:04:09  11:52:38
36 processes:  1 running, 35 sleeping                                         
CPU:  2.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 97.6% idle        
Mem: 62M Active, 1127M Inact, 14G Wired, 216M Free                            
ARC: 13G Total, 11G MFU, 792M MRU, 432K Anon, 783M Header, 330M Other         
Swap: 8192M Total, 230M Used, 7962M Free, 2% Inuse                            
                                                                              
  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND 

1155 root          1  20    0   793M   740M select  0   0:09   0.00% devd 


I'm updating here so anyone who runs into memory trouble can find which process to kill.
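For reference, a quick way to list the biggest memory consumers with nothing but stock FreeBSD tools (column 6 of ps aux is resident size in KB):
Code:
# Top 10 processes by resident memory
ps aux | sort -nrk 6 | head -10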
 