after upgrade to 11.1-U5 CIFS shares periodically stop and start

Status
Not open for further replies.

mmontgom

Cadet
Joined
Jan 26, 2018
Messages
8
We updated to U5 on Sunday 11/24 and after the update the cifs shares continually drop out. Over the past 20 hours we have lost the cifs shares 139 times.

The most noticeable pattern is that we will see a
Jun 28 15:06:23 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs

log entry every 20 to 30 minutes followed by three of the same log entries every 30 seconds or so.

We have "Enable Monitoring" set in the Active Directory configuration. We see in the logs that FreeNAS is checking to see if AD is there and it will indicate success. Then, all of a sudden, you will see the "call notifier.stop cifs" and we lose the shares until it restarts. This is severely impacting our operation. Our plan is to roll back to U4 after hours today and see if this stops the bouncing.

representative /var/log/messages snippet follows:
Jun 28 15:21:36 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 15:21:36 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 15:21:36 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 15:21:36 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 15:21:37 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 15:23:00 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 15:23:00 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 15:23:00 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 15:23:01 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 15:23:02 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 15:23:02 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 28 15:23:04 _NAS ActiveDirectory: /usr/sbin/service samba_server forcestop
Jun 28 15:23:04 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 28 15:23:08 _NAS ActiveDirectory: /usr/sbin/service ix-kinit forcestop
Jun 28 15:23:08 _NAS ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Jun 28 15:23:08 _NAS ActiveDirectory: /usr/sbin/service ix-kerberos restart
Jun 28 15:23:09 _NAS ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
Jun 28 15:23:09 _NAS ActiveDirectory: /usr/sbin/service ix-pam quietstop
Jun 28 15:23:10 _NAS ActiveDirectory: /usr/sbin/service ix-cache quietstop &
Jun 28 15:24:33 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 28 15:24:35 _NAS ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Jun 28 15:24:36 _NAS ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default EXAMPLE.LOCAL
Jun 28 15:24:39 _NAS ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
Jun 28 15:24:39 _NAS ActiveDirectory: /usr/sbin/service ix-ldap quietstart
Jun 28 15:24:39 _NAS ActiveDirectory: /usr/sbin/service ix-kinit quietstart
Jun 28 15:24:40 _NAS ActiveDirectory: kerberos_start: /usr/bin/kinit --renewable --password-file=/tmp/tmp.w31R8pcE Administrator@EXAMPLE.LOCAL
Jun 28 15:24:44 _NAS ActiveDirectory: kerberos_start: Successful
Jun 28 15:24:44 _NAS ActiveDirectory: /usr/sbin/service ix-kinit status
Jun 28 15:24:44 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 15:24:44 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 15:24:44 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 28 15:24:49 _NAS ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
Jun 28 15:24:52 _NAS ActiveDirectory: activedirectory_start: checking if we are joined already
Jun 28 15:24:52 _NAS ActiveDirectory: AD_testjoin_domain: net -k ads testjoin Example.local -S mr-slave.example.local -p 389
Jun 28 15:24:53 _NAS ActiveDirectory: AD_testjoin_domain: Successful
Jun 28 15:24:53 _NAS ActiveDirectory: activedirectory_start: skipping join, already joined
Jun 28 15:24:53 _NAS ActiveDirectory: /usr/sbin/service ix-activedirectory status
Jun 28 15:24:53 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 15:24:53 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 15:24:54 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 15:24:54 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 28 15:24:56 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 28 15:25:00 _NAS ActiveDirectory: /usr/sbin/service ix-pam quietstart
Jun 28 15:25:02 _NAS ActiveDirectory: /usr/sbin/service ix-cache quietstart &
Jun 28 15:25:13 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 15:25:13 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 15:25:13 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 15:25:13 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 15:25:14 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 15:26:26 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 15:26:26 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 15:26:27 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 15:26:27 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 15:26:28 _NAS ActiveDirectory: AD_status_domain: Okay

Your guidance is appreciated....
 
D

dlavigne

Guest
Does the behavior stop if you uncheck Enable Monitoring? We have found that in some environments that works and are working on a more elegant solution for those environments.
 

mmontgom

Cadet
Joined
Jan 26, 2018
Messages
8
We unchecked Enable Monitoring and it ran fine (though log was quite chatty with AD status and kerberos status messages) until we started accessing the shares and we saw the "notifier.stop cifs" entry and it stayed stopped.

Log snippet:
Jun 28 19:41:33 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 19:41:33 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 19:41:34 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 19:41:34 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 19:41:35 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 19:41:37 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 19:41:37 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 19:41:38 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 19:41:38 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 19:41:39 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 19:41:41 _NAS ActiveDirectory: kerberos_status: klist -t
Jun 28 19:41:41 _NAS ActiveDirectory: kerberos_status: Successful
Jun 28 19:41:42 _NAS ActiveDirectory: activedirectory_status: checking status
Jun 28 19:41:42 _NAS ActiveDirectory: AD_status_domain: net -k ads status Example.local
Jun 28 19:41:43 _NAS ActiveDirectory: AD_status_domain: Okay
Jun 28 19:41:43 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 28 19:41:45 _NAS ActiveDirectory: /usr/sbin/service samba_server forcestop
Jun 28 19:41:46 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 28 19:41:49 _NAS ActiveDirectory: /usr/sbin/service ix-kinit forcestop
Jun 28 19:41:49 _NAS ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Jun 28 19:41:50 _NAS ActiveDirectory: /usr/sbin/service ix-kerberos restart
Jun 28 19:41:50 _NAS ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
Jun 28 19:41:50 _NAS ActiveDirectory: /usr/sbin/service ix-pam quietstop
Jun 28 19:41:52 _NAS ActiveDirectory: /usr/sbin/service ix-cache quietstop &
 
D

dlavigne

Guest
Please create a report at bugs.freenas.org and post the issue number here.
 

mmontgom

Cadet
Joined
Jan 26, 2018
Messages
8
We booted into U4 and it seems the issue is still present.

seems smbd is crashing for some reason. How do you make the logs more verbose?

Jun 29 10:37:58 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 29 10:38:00 _NAS ActiveDirectory: /usr/sbin/service ix-kerberos quietstop
Jun 29 10:38:00 _NAS ActiveDirectory: /usr/sbin/service ix-nsswitch quietstop
Jun 29 10:38:01 _NAS ActiveDirectory: /usr/sbin/service ix-pam quietstop
Jun 29 10:38:02 _NAS ActiveDirectory: /usr/sbin/service ix-activedirectory forcestop
Jun 29 10:38:06 _NAS ActiveDirectory: /usr/sbin/service ix-cache quietstop &
Jun 29 10:38:09 _NAS ActiveDirectory: /usr/sbin/service samba_server forcestop
Jun 29 10:38:09 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 29 10:38:13 _NAS ActiveDirectory: /usr/sbin/service ix-kinit forcestop
Jun 29 10:38:13 _NAS ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x814d58000 with size 0x7000 to be written at offset 0x35c0000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x814d58000 with size 0x7000 to be written at offset 0x35c0000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81a6b1000 with size 0x9000 to be written at offset 0x88f9000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81a6b1000 with size 0x9000 to be written at offset 0x88f9000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81cc0a000 with size 0x7000 to be written at offset 0x8935000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81cc0a000 with size 0x7000 to be written at offset 0x8935000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81d217000 with size 0xf2000 to be written at offset 0x893f000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81d217000 with size 0xf2000 to be written at offset 0x893f000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81d309000 with size 0xd000 to be written at offset 0x8a31000 for process smbd
Jun 29 10:38:14 _NAS kernel: Failed to fully fault in a core file segment at VA 0x81d309000 with size 0xd000 to be written at offset 0x8a31000 for process smbd
Jun 29 10:38:14 _NAS kernel: pid 2922 (smbd), uid 0: exited on signal 6 (core dumped)
Jun 29 10:39:55 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 29 10:39:55 _NAS smbd: in openpam_check_error_code(): pam_sm_setcred(): unexpected return value 12
Jun 29 10:39:55 _NAS smbd: in openpam_check_error_code(): pam_sm_setcred(): unexpected return value 12
Jun 29 10:39:57 _NAS ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Jun 29 10:39:58 _NAS ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default EXAMPLE.LOCAL
Jun 29 10:40:00 _NAS ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
Jun 29 10:40:01 _NAS ActiveDirectory: /usr/sbin/service ix-ldap quietstart
Jun 29 10:40:01 _NAS ActiveDirectory: /usr/sbin/service ix-kinit quietstart
Jun 29 10:40:05 _NAS ActiveDirectory: /usr/sbin/service ix-kinit status
Jun 29 10:40:06 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 29 10:40:11 _NAS ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
Jun 29 10:40:16 _NAS ActiveDirectory: /usr/sbin/service ix-activedirectory status
Jun 29 10:40:17 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Jun 29 10:40:19 _NAS ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Jun 29 10:40:25 _NAS ActiveDirectory: /usr/sbin/service ix-pam quietstart
Jun 29 10:40:26 _NAS ActiveDirectory: /usr/sbin/service ix-cache quietstart &
 

mmontgom

Cadet
Joined
Jan 26, 2018
Messages
8
Can't login to bugs.freenas.org to submit bug report even after resetting password
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
That looks like service monitoring is still enabled. Try the following:
sqlite3 /data/freenas-v1.db 'UPDATE services_servicemonitor SET sm_enable=0 WHERE sm_name="activedirectory";'
service middlewared restart
tail -f /var/log/middlewared.log

The middlewared log should indicate that AD is no longer being monitored.
 

mmontgom

Cadet
Joined
Jan 26, 2018
Messages
8
That fixed it. Thank you!

Service Monitoring has been enabled since we installed 11.1-U1. What was the cause for needing to turn of Service Monitoring?

We noticed that the SMB share instability appeared when we installed U5, however when we reverted to U4, the issue persisted.

At the same time we updated to U5, we ran Windows Update on our servers which included both our Active Directory domain controllers.

In the update "2018-06 Security Monthly Quality Rollup for Windows Server 2012 R2 for x64-based Systems (KB4284815)" there were "Security updates to Internet Explorer, Windows apps, remote code execution, Windows Server, Windows storage and filesystems, and Windows wireless networking."

Could this update be the cause of the issues we experienced requiring Service Monitoring to be turned off?

What risks exist with Service Monitoring off?
 
Status
Not open for further replies.
Top