Samba crashes frequently under load on FreeNAS-11.0-RC3 (1d8ee7925)

Status
Not open for further replies.

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Hi All,

This issue actually started once the latest 9.10 U4 release came out but I thought an upgrade to FreeNAS 11 would have fixed it but it doesn't

When you copy files from Windows to the shared folder it works fine but after about 3 minutes or so the share drops and you get a network error on the client. In FreeNAS you get an error saying

"We tried 1 attempts to recover service activedirectory" and in the shell footer the message below in the code snippet appears

I tried a search and the only other result for this was because someone had dedup enabled I do not. The service always restarts itself and everything comes back online again but you have to click retry on the client machine all the time which is really annoying.

I am currently trying to fill the NAS from another machine and it only craps out when the transfer is running at full tilt 133MB/s of the 1GB LAN connection.

I can't roll back now as I upgraded the pool (My Fault I know) but I just cannot seem to get to the bottom of this issue.

Other points of note is that the FreeNAS Server is connected to AD and uses AD Users to authenticate.

TL;DR Samba / ActiveDirectory restarts constantly when copying things to the NAS from Windows.

Code:
May 31 11:57:04 Vault /adtool: [common.pipesubr:66] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmp61caxaem FreeNASAdmin@MyDomain.UK
May 31 11:57:05 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 11:57:05 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default MyDomain.UK
May 31 11:57:07 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
May 31 11:57:07 Vault ActiveDirectory: /usr/sbin/service ix-ldap quietstart
May 31 11:57:07 Vault ActiveDirectory: /usr/sbin/service ix-kinit quietstart
May 31 11:57:09 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 11:57:09 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 11:57:12 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
May 31 11:57:16 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory status
May 31 11:57:18 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 11:57:19 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 11:57:23 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstart
May 31 11:57:23 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstart &
May 31 12:01:44 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 12:01:46 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstop
May 31 12:01:46 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
May 31 12:01:47 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstop
May 31 12:01:47 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory forcestop
May 31 12:01:50 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstop &
May 31 12:01:52 Vault ActiveDirectory: /usr/sbin/service samba_server forcestop
May 31 12:01:52 Vault ActiveDirectory: /usr/sbin/service ix-pre-samba start
May 31 12:01:53 Vault ActiveDirectory: /usr/sbin/service ix-kinit forcestop
May 31 12:01:53 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 12:01:55 Vault /adtool: [common.pipesubr:66] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmpfsiwqvme FreeNASAdmin@MyDomain.UK
May 31 12:01:56 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 12:01:56 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default MyDomain.UK
May 31 12:01:58 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
May 31 12:01:59 Vault ActiveDirectory: /usr/sbin/service ix-ldap quietstart
May 31 12:01:59 Vault ActiveDirectory: /usr/sbin/service ix-kinit quietstart
May 31 12:02:00 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 12:02:00 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 12:02:04 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
May 31 12:02:08 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory status
May 31 12:02:09 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 12:02:11 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 12:02:14 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstart
May 31 12:02:14 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstart &
 

Attachments

  • Capture.PNG
    Capture.PNG
    65.4 KB · Views: 337
Last edited by a moderator:

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
I can't roll back now as I upgraded the pool (My Fault I know) but I just cannot seem to get to the bottom of this issue.
Never upgrade the pool unless you're 100% certain you don't intend to go back.

FreeNAS 11-RC4 was just released. You may want to update and see if the issue persists. If it does, open a bug ticket at https://bugs.freenas.org.
 

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Never upgrade the pool unless you're 100% certain you don't intend to go back.

FreeNAS 11-RC4 was just released. You may want to update and see if the issue persists. If it does, open a bug ticket at https://bugs.freenas.org.

Yeah this was a bad move on my part However the issue appeared in 9.10 U4 , In regards to the FreeNAS 11-RC4 I cannot see that on my update train I am using the FreeNas 11 stable train am I missing something?
 

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
I have done some more analysis on this and can replicate it easily. It seems linked to the value in "AD check connectivity frequency (seconds):" Under Directory > Active Directory if I set this in seconds to 5 minutes it fails every five minutes and if I set it to 3600 seconds which is the maximum value (1 hour) it resets itself every hour . Here is the extract from the /var/log/messages

If there was a way I could disable the connectivity check it would be fine. There is no issues with the domain that I can see, I think I might try and force it to look at one domain controller only see if that replicates but at the moment the connection resets every hour which although is a pain in the a$$ is better than every 5 minutes.

Would increasing the "AD Timeout" Value below it make any difference? What seems to happen is whatever is happening with Samba gets terminated client gets an error but when they reconnect 10 seconds later it works again for another hour

Open to suggestions


Code:
May 31 14:45:30 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstop
May 31 14:45:30 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
May 31 14:45:30 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstop
May 31 14:45:30 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory forcestop
May 31 14:45:32 Vault ActiveDirectory: activedirectory_stop: leaving domain
May 31 14:45:34 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstop &
May 31 14:45:35 Vault ActiveDirectory: /usr/sbin/service samba_server forcestop
May 31 14:45:36 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 14:45:38 Vault ActiveDirectory: /usr/sbin/service ix-kinit forcestop
May 31 14:45:38 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 14:45:40 Vault /adtool: [common.pipesubr:66] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmp_cxn1ay6 FreeNASAdmin@MyDomain.UK
May 31 14:45:41 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 14:45:42 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 14:45:42 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default MyDomain.UK
May 31 14:45:44 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
May 31 14:45:44 Vault ActiveDirectory: /usr/sbin/service ix-ldap quietstart
May 31 14:45:44 Vault ActiveDirectory: /usr/sbin/service ix-kinit quietstart
May 31 14:45:45 Vault ActiveDirectory: kerberos_start: /usr/bin/kinit --renewable --password-file=/tmp/tmp.ZISs1G0e FreeNASAdmin@MyDomain.UK
May 31 14:45:46 Vault ActiveDirectory: kerberos_start: Successful
May 31 14:45:46 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 14:45:46 Vault ActiveDirectory: kerberos_status: klist -t
May 31 14:45:46 Vault ActiveDirectory: kerberos_status: Successful
May 31 14:45:46 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 14:45:49 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
May 31 14:45:51 Vault ActiveDirectory: activedirectory_start: trying to join domain
May 31 14:45:51 Vault ActiveDirectory: AD_join_domain: net -k ads join MyDomain.UK -S WSDC01.MyDomain.UK -p 389
May 31 14:45:53 Vault ActiveDirectory: AD_join_domain: Successful
May 31 14:45:53 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory status
May 31 14:45:54 Vault ActiveDirectory: activedirectory_status: checking status
May 31 14:45:54 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 14:45:55 Vault ActiveDirectory: AD_status_domain: Okay
May 31 14:45:55 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 14:45:57 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 14:46:01 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstart
May 31 14:46:01 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstart &
May 31 14:46:05 Vault ActiveDirectory: kerberos_status: klist -t
May 31 14:46:05 Vault ActiveDirectory: kerberos_status: Successful
May 31 14:46:05 Vault ActiveDirectory: activedirectory_status: checking status
May 31 14:46:05 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 14:46:06 Vault ActiveDirectory: AD_status_domain: Okay
May 31 15:47:23 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 15:47:24 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstop
May 31 15:47:25 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
May 31 15:47:25 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstop
May 31 15:47:25 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory forcestop
May 31 15:47:27 Vault ActiveDirectory: activedirectory_stop: leaving domain
May 31 15:47:29 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstop &
May 31 15:47:31 Vault ActiveDirectory: /usr/sbin/service samba_server forcestop
May 31 15:47:31 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 15:47:34 Vault ActiveDirectory: /usr/sbin/service ix-kinit forcestop
May 31 15:47:34 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 15:47:36 Vault /adtool: [common.pipesubr:66] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmpqxqbj2ke FreeNASAdmin@MyDomain.UK
May 31 15:47:37 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 15:47:39 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 15:47:40 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default MyDomain.UK
May 31 15:47:41 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
May 31 15:47:42 Vault ActiveDirectory: /usr/sbin/service ix-ldap quietstart
May 31 15:47:42 Vault ActiveDirectory: /usr/sbin/service ix-kinit quietstart
May 31 15:47:42 Vault ActiveDirectory: kerberos_start: /usr/bin/kinit --renewable --password-file=/tmp/tmp.3hS3daQt FreeNASAdmin@MyDomain.UK
May 31 15:47:43 Vault ActiveDirectory: kerberos_start: Successful
May 31 15:47:43 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 15:47:43 Vault ActiveDirectory: kerberos_status: klist -t
May 31 15:47:43 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 15:47:43 Vault ActiveDirectory: kerberos_status: klist -t
May 31 15:47:43 Vault ActiveDirectory: kerberos_status: Successful
May 31 15:47:43 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 15:47:47 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
May 31 15:47:49 Vault ActiveDirectory: activedirectory_start: trying to join domain
May 31 15:47:49 Vault ActiveDirectory: AD_join_domain: net -k ads join MyDomain.UK -S WSDC01.MyDomain.UK -p 389
May 31 15:47:51 Vault ActiveDirectory: AD_join_domain: Successful
May 31 15:47:51 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory status
May 31 15:47:51 Vault ActiveDirectory: activedirectory_status: checking status
May 31 15:47:51 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 15:47:52 Vault ActiveDirectory: AD_status_domain: Okay
May 31 15:47:52 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 15:47:53 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 15:47:57 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstart
May 31 15:47:57 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstart &
May 31 15:48:01 Vault ActiveDirectory: kerberos_status: klist -t
May 31 15:48:01 Vault ActiveDirectory: kerberos_status: Successful
May 31 15:48:02 Vault ActiveDirectory: activedirectory_status: checking status
May 31 15:48:02 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 15:48:03 Vault ActiveDirectory: AD_status_domain: Okay
May 31 16:49:19 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 16:49:21 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstop
May 31 16:49:21 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
May 31 16:49:21 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstop
May 31 16:49:21 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory forcestop
May 31 16:49:23 Vault ActiveDirectory: activedirectory_stop: leaving domain
May 31 16:49:25 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstop &
May 31 16:49:27 Vault ActiveDirectory: /usr/sbin/service samba_server forcestop
May 31 16:49:27 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 16:49:32 Vault ActiveDirectory: /usr/sbin/service ix-kinit forcestop
May 31 16:49:32 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 16:49:34 Vault /adtool: [common.pipesubr:66] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmp3lqp37kj FreeNASAdmin@MyDomain.UK
May 31 16:49:35 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 16:49:37 Vault ActiveDirectory: /usr/sbin/service ix-hostname quietstart
May 31 16:49:38 Vault ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default MyDomain.UK
May 31 16:49:39 Vault ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
May 31 16:49:40 Vault ActiveDirectory: /usr/sbin/service ix-ldap quietstart
May 31 16:49:40 Vault ActiveDirectory: /usr/sbin/service ix-kinit quietstart
May 31 16:49:40 Vault ActiveDirectory: kerberos_start: /usr/bin/kinit --renewable --password-file=/tmp/tmp.3YBXqluD FreeNASAdmin@MyDomain.UK
May 31 16:49:41 Vault ActiveDirectory: kerberos_start: Successful
May 31 16:49:41 Vault ActiveDirectory: /usr/sbin/service ix-kinit status
May 31 16:49:42 Vault ActiveDirectory: kerberos_status: klist -t
May 31 16:49:42 Vault ActiveDirectory: kerberos_status: Successful
May 31 16:49:42 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 16:49:45 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
May 31 16:49:47 Vault ActiveDirectory: activedirectory_start: trying to join domain
May 31 16:49:47 Vault ActiveDirectory: AD_join_domain: net -k ads join MyDomain.UK -S WSDC01.MyDomain.UK -p 389
May 31 16:49:49 Vault ActiveDirectory: AD_join_domain: Successful
May 31 16:49:49 Vault ActiveDirectory: /usr/sbin/service ix-activedirectory status
May 31 16:49:49 Vault ActiveDirectory: activedirectory_status: checking status
May 31 16:49:49 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 16:49:50 Vault ActiveDirectory: AD_status_domain: Okay
May 31 16:49:50 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
May 31 16:49:52 Vault ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
May 31 16:49:55 Vault ActiveDirectory: /usr/sbin/service ix-pam quietstart
May 31 16:49:55 Vault ActiveDirectory: /usr/sbin/service ix-cache quietstart &
May 31 16:49:59 Vault ActiveDirectory: kerberos_status: klist -t
May 31 16:49:59 Vault ActiveDirectory: kerberos_status: Successful
May 31 16:50:00 Vault ActiveDirectory: activedirectory_status: checking status
May 31 16:50:00 Vault ActiveDirectory: AD_status_domain: net -k ads status MyDomain.UK
May 31 16:50:01 Vault ActiveDirectory: AD_status_domain: Okay

 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Try setting the following parameter under "services" -> "SMB". smb2 leases = no. There is currently a bug in the oplock code in samba that can cause smbd processes to panic in certain situations involving smb2 leases.
 
Last edited:

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Try setting the following parameter under "services" -> "SMB". smb2 leases = no. There is currently a bug in the oplock code in samba that can cause smbd processes to panic in certain situations involving smb2 leases.

Thanks for your reply.

I hope I am not being stupid here but I cant see that command anywhere am I missing something?

Here is what I see
 

Attachments

  • smb.png
    smb.png
    39.6 KB · Views: 345

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Ahh sorry get ya,

like this ??
 

Attachments

  • bbbb.PNG
    bbbb.PNG
    14.4 KB · Views: 360

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
done and done. will report back either way. I set the AD timeout to 300 should I put this back to 60 or leave it as is?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Your logging level is a bit too verbose for me to get a high-level view of what's going on in Samba. Try setting logging to "normal", reproduce the problem, then send me another debug.
 

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Your logging level is a bit too verbose for me to get a high-level view of what's going on in Samba. Try setting logging to "normal", reproduce the problem, then send me another debug.
Hi mate, I have sent you a new debug with the less verbose logging thanks again
 

Crm

Dabbler
Joined
Sep 14, 2015
Messages
27
Hi there, Yeah it is a VM but the boot device and the drive pool are all on passed through hardware so should be pretty close to being bare metal but I have an identical setup I can replicate it on if needed. I tried a similar build and it had same issue but I can do a bare metal install on the same machine to replicate if need be
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
You might want to look at the bugs/issues for 11.0 RC3 and RC4 on bugs.freenas.org

IIRC, there have been several AD issues, as well as timeout problems.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
You might want to look at the bugs/issues for 11.0 RC3 and RC4 on bugs.freenas.org

IIRC, there have been several AD issues, as well as timeout problems.

I don't think any exactly mirror what he's describing. Especially since he upgrade to FN11 to try to fix the problem. If he can reproduce it on bare metal, I'll punt the issue to the bugtracker. :D
 
Status
Not open for further replies.
Top