kernel: lockd: server [IP] not responding, timed out

Status
Not open for further replies.

FNNewbie

Cadet
Joined
Nov 8, 2013
Messages
5
Hi,

I have FreeNAS 9.1.1 on a SuperMicro chassis with 36 disks, and about 30 Red Hat Linux 5 clients that NFS-mount the FreeNAS box. After a while, I start to get

kernel: lockd: server [FreeNAS_IP] not responding, timed out

on the Linux clients whenever they access the FreeNAS box. Certain programs just hang when run from the FreeNAS NFS mounts, but run fine from mounts on other NAS storage such as NetApp. Could you advise on troubleshooting this?

Note that on FreeNAS I see only one nfsd server process:
# ps -ax | grep nfs
2970 ?? Is 0:00.18 nfsd: master (nfsd)
2971 ?? S 218:48.56 nfsd: server (nfsd)
52917 0 S+ 0:00.00 grep nfs
This is even though the NFS settings in the FreeNAS interface have the default of 4 for "number of servers." I tried to change it to a higher number, but the GUI seems to hang. What is the command line or file to achieve the same? Do I need to restart the NFS services?

Thank you.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well. I was seeing that too but hadn't gone to see what was wrong. Interesting.
 

FNNewbie

Cadet
Joined
Nov 8, 2013
Messages
5
Thanks for the comments so far.

Regarding the timeout:
I noticed that if I NFS-mount it with the nolock option, I can bring up the programs that used to hang in that same NFS mount directory, so it has to do with NFS locking. Further investigation shows the program indeed has trouble obtaining write locks in the NFS-mounted directory. What is the consequence of mounting with nolock? Is it likely to corrupt files, since the same volume is mounted and used on multiple NFS clients?
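
To check the locking problem without the hanging program, a small probe can request a write lock in the mounted directory and give up after a few seconds. This is only a sketch under assumptions: it needs Python 3 (not the stock Python on my clients), the function name try_write_lock is made up, and on a hard mount without intr the alarm may not be able to interrupt the stuck lock request, in which case the probe itself hangs.

```python
import fcntl
import os
import signal
import sys
import tempfile


def try_write_lock(directory, timeout_s=5):
    """Return True if a POSIX write lock is granted within timeout_s seconds."""
    def on_alarm(signum, frame):
        raise TimeoutError("lock request timed out")

    # Lock a scratch file so no real data file is touched.
    fd, path = tempfile.mkstemp(dir=directory, prefix="locktest.")
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(timeout_s)
    try:
        # fcntl()-style byte-range lock; on an NFSv3 mount without
        # nolock this request is forwarded to the server's lockd.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    except (TimeoutError, OSError):
        return False
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Usage: python3 locktest.py <directory on the NFS mount>
    ok = try_write_lock(sys.argv[1])
    print("lock granted" if ok else "lock NOT granted")
```

Running it against the same directory mounted with and without nolock should show whether the lock request is the part that stalls.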

Regarding the FreeNAS GUI taking forever to apply changes (number of servers) to the NFS settings, I just noticed the following. I wonder whether those forcestop commands never went through, hence the GUI problem?
# ps -aux | grep lockd
root 2993 0.0 0.0 14120 1816 ?? Ds 20Nov13 0:53.96 /usr/sbin/rpc.lockd
root 40083 0.0 0.0 14492 2088 ?? I 13Feb14 0:00.03 sh -c (/usr/sbin/service lockd forcestop)
root 40084 0.0 0.0 14492 2756 ?? I 13Feb14 0:00.04 /bin/sh /etc/rc.d/lockd forcestop
root 40399 0.0 0.0 14492 2088 ?? I 13Feb14 0:00.03 sh -c (/usr/sbin/service lockd forcestop)
root 40400 0.0 0.0 14492 2756 ?? I 13Feb14 0:00.04 /bin/sh /etc/rc.d/lockd forcestop
root 86962 0.0 0.0 14492 2088 ?? I 6:53PM 0:00.03 sh -c (/usr/sbin/service lockd forcestop)
root 86963 0.0 0.0 14492 2756 ?? I 6:53PM 0:00.04 /bin/sh /etc/rc.d/lockd forcestop
root 10021 0.0 0.0 16268 1872 0 S+ 10:57AM 0:00.00 grep lockd

I believe that when I changed 'number of servers' via the WebUI, it started /usr/sbin/service lockd forcestop, which then runs pwait to wait for /usr/sbin/rpc.lockd to terminate:


40083 ?? I 0:00.03 | |-- sh -c (/usr/sbin/service lockd forcestop) 2>&1 | logger -p daemon.notice -t notifier
40084 ?? I 0:00.04 | | |-- /bin/sh /etc/rc.d/lockd forcestop
40201 ?? I 0:00.00 | | | `-- pwait 2993
40085 ?? I 0:00.00 | | `-- logger -p daemon.notice -t notifier

So for whatever reason, the original lockd refuses to die, and I wonder if this caused the problems. Is this a bug? Is it safe to manually restart the lockd service?
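
One more detail from the ps -aux listing: the STAT column for rpc.lockd (PID 2993) reads Ds, and D marks a process in uninterruptible wait, which would fit a daemon that ignores the stop request. Here is a rough sketch for picking such rows out of saved ps -aux output. The function name is made up, and it assumes the usual 11-column layout (USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND).

```python
def uninterruptible(ps_aux_text):
    """Return (pid, command) for every row whose STAT starts with 'D'."""
    hits = []
    for line in ps_aux_text.splitlines():
        # Split into at most 11 fields so COMMAND keeps its spaces.
        fields = line.split(None, 10)
        # Skip the header row and anything that is not a full process row.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        if fields[7].startswith("D"):
            hits.append((int(fields[1]), fields[10]))
    return hits


# Two rows copied from the listing in this post.
sample = """\
root 2993 0.0 0.0 14120 1816 ?? Ds 20Nov13 0:53.96 /usr/sbin/rpc.lockd
root 40084 0.0 0.0 14492 2756 ?? I 13Feb14 0:00.04 /bin/sh /etc/rc.d/lockd forcestop
"""
print(uninterruptible(sample))
```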

Thank you.
 