Chris Tobey
Hi guys,
I am hoping someone has some insight into an issue I have been seeing for the last week since I updated to 11.2 U5.
Hardware:
Chassis: SuperMicro 6048R-E1CR24N
Motherboard: SuperMicro X10DRi-T4+
CPU: 2 x Intel E5-2620v3 (6 core/12 thread @ 2.4GHz each)
RAM: 12 x SuperMicro DR416L-SL01-ER21 (Samsung, 16GB DDR4 2133 ECC Registered)
HBA: SuperMicro AOC-S3108L-H8IR-O-P (LSI MegaRAID 8-port SAS 12Gb/s)
Network: Chelsio T420-LL-CR (2 x SFP+ for 10G Fibre)
Boot: 2 x Samsung 840 Pro 128GB
Pool: 3 x RAIDZ2 vdevs of 6x8TB Seagate HDDs.
Configuration:
FreeNAS 11.2 U5 connected to Active Directory.
Three datasets in zpool shared via SMB and NFS.
As to my issue: I upgraded from 11.1 U6 to 11.2 U5 one week ago, and in that time the system has partially locked up on four separate occasions. SMB access and performance are good until, suddenly, the shares are not accessible. I cannot SSH to the system because the connection keeps reporting a broken pipe. The dashboard shows some details, but things like Network Info display nothing. I can still access most of the UI through the web interface, including netdata, which does not show anything out of the ordinary (<5% CPU usage, normal IO).
I can fix the issue by going to Directory Services > Active Directory, unchecking Enable, saving, then re-checking Enable and saving again. Similarly, if I am able to get a shell working, I can run the following:
Code:
root@fileserver:~ # /etc/directoryservice/ActiveDirectory/ctl start
False
True
Join is OK
False
True
The syslog shows nothing when the issue occurs, and shows the following when I run the command above:
Code:
Aug 13 20:12:41 fileserver ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Aug 13 20:12:43 fileserver ActiveDirectory: /usr/sbin/service ix-hostname quietstart
Aug 13 20:12:43 fileserver ActiveDirectory: /usr/sbin/service ix-kerberos quietstart default LOCATION.DOMAIN.COM
Aug 13 20:12:43 fileserver ActiveDirectory: /usr/sbin/service ix-nsswitch quietstart
Aug 13 20:12:43 fileserver ActiveDirectory: /usr/sbin/service ix-ldap quietstart
Aug 13 20:12:43 fileserver ActiveDirectory: /usr/sbin/service ix-kinit quietstart
Aug 13 20:12:45 fileserver ActiveDirectory: /usr/sbin/service ix-kinit status
Aug 13 20:12:45 fileserver ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.start cifs
Aug 13 20:12:50 fileserver ActiveDirectory: /usr/sbin/service ix-activedirectory quietstart
Aug 13 20:12:53 fileserver ActiveDirectory: /usr/sbin/service ix-activedirectory status
Aug 13 20:12:55 fileserver ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs
Aug 13 20:12:56 fileserver kernel: Failed to fully fault in a core file segment at VA 0x819849000 with size 0x11000 to be written at offset 0x63f2000 for process smbd
Aug 13 20:12:56 fileserver kernel: Failed to fully fault in a core file segment at VA 0x819849000 with size 0x11000 to be written at offset 0x63f2000 for process smbd
Aug 13 20:12:56 fileserver kernel: pid 10232 (smbd), uid 0: exited on signal 6 (core dumped)
Aug 13 20:12:56 fileserver ActiveDirectory: /usr/sbin/python /usr/local/bin/midclt call notifier.start cifs
Aug 13 20:13:01 fileserver ActiveDirectory: /usr/sbin/service ix-pam quietstart
Aug 13 20:13:01 fileserver ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.cachetool fill
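The only entry in that output that looks abnormal is smbd exiting on signal 6 (SIGABRT) partway through the restart. To see whether this happens on every occurrence, the kernel "exited on signal" lines can be pulled out of the log. The following is a minimal Python sketch that assumes the syslog line format shown above; the name `find_crashes` and the `SAMPLE` list are illustrative and not part of FreeNAS.

```python
import re

# Matches FreeBSD kernel lines of the form seen in the log above, e.g.
#   "... kernel: pid 10232 (smbd), uid 0: exited on signal 6 (core dumped)"
CRASH_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+kernel:\s+"
    r"pid (?P<pid>\d+) \((?P<proc>[^)]+)\), uid (?P<uid>\d+): "
    r"exited on signal (?P<sig>\d+)(?P<core> \(core dumped\))?"
)


def find_crashes(lines):
    """Return one dict per 'exited on signal' line found in `lines`."""
    crashes = []
    for line in lines:
        m = CRASH_RE.match(line.strip())
        if m:
            crashes.append({
                "when": m.group("when"),
                "process": m.group("proc"),
                "pid": int(m.group("pid")),
                "signal": int(m.group("sig")),
                "core_dumped": m.group("core") is not None,
            })
    return crashes


# Two lines copied from the log above, used here as sample input.
SAMPLE = [
    "Aug 13 20:12:55 fileserver ActiveDirectory: /usr/local/bin/python /usr/local/bin/midclt call notifier.stop cifs",
    "Aug 13 20:12:56 fileserver kernel: pid 10232 (smbd), uid 0: exited on signal 6 (core dumped)",
]

if __name__ == "__main__":
    for crash in find_crashes(SAMPLE):
        print(crash)
```

On the actual system the input would be the lines of the syslog file (on stock FreeBSD this is /var/log/messages; whether that holds for this install is an assumption) rather than `SAMPLE`.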
Once AD is restarted, everything appears to return to normal: I can access the SMB shares, the dashboard updates, and life is good.
This is a live system in use by 100+ servers 24/7 via NFS and 100+ users during work hours, so reboots are... difficult, but restarting Samba or modifying some share properties is doable.
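As a stopgap until the root cause is found, the shell workaround described earlier could be wrapped in a small watchdog. The following is a rough Python sketch, not something confirmed to work on this system. It assumes the hang shows up as a refused or timed-out TCP connection to port 445, which has not been verified here (a hung smbd may still accept connections). The only piece taken from this post is the `ctl start` command; the function names are hypothetical.

```python
import socket
import subprocess

# Restart command as quoted in the post; everything else is an assumption.
AD_RESTART_CMD = ["/etc/directoryservice/ActiveDirectory/ctl", "start"]


def smb_port_open(host="127.0.0.1", port=445, timeout=5.0):
    """Return True if a TCP connection to the SMB port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections and timeouts alike.
        return False


def check_and_recover(host="127.0.0.1", port=445, restart=subprocess.run):
    """Run the AD restart if SMB is unreachable.

    Returns True if a restart was attempted, False if SMB looked healthy.
    `restart` is injectable so the logic can be exercised without
    touching the real service.
    """
    if smb_port_open(host, port):
        return False
    restart(AD_RESTART_CMD, check=False)
    return True
```

Calling `check_and_recover()` from a periodic job would be one way to use it. A TCP connect is a deliberately crude health check; if the shares hang while the port still accepts connections, the probe would need to be replaced with something that actually authenticates against a share.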