Jail hangs then hangs GUI and SSH of NAS

Status
Not open for further replies.

keithg

Explorer
Joined
May 15, 2013
Messages
92
I have been exercising my NAS recently. I have a 9.1 standard jail and am running abcde in it to rip some audio CDs. I had a couple of issues with the CD drive but finally got it all going. I can rip with no problem.

What happens is that the jail hangs while I am SSH'd into it, and it appears to be unrecoverable. I cannot restart the jail from the GUI because the left pane of the GUI will not fill in with anything. If I select Jails and try to restart it from the right pane, it just says 'loading' and nothing happens. I do not know what the trigger is, but it is sometimes a mistyped command. The scary part is that I have to hard power the NAS down to get it to reboot. My jail is a standard BSD jail and I am using a 9.1 template. When it hangs, I also cannot SSH into the NAS: I start the SSH handshake, it responds with 'password', I type it in, and it hangs. Totally unrecoverable. My NFS mounts are still being served. I have a couple of Raspberry Pis with NFS-mounted root partitions, and I cannot SSH into them either. I do not know how to get the log to try to figure it out, as I have to hard power down to get it back up. Please help.
 

keithg

Explorer
Joined
May 15, 2013
Messages
92
Dell Precision 490 Xeon, 6 GB RAM, three 2 TB drives as a ZFS pool. Dmesg is attached. I only saw one thing that was new/odd, and I believe it has to do with the CD drive I just put in. I have run this system without any issues at all for about 2 years. I started on version 8.x and have upgraded through 9.1, 9.2 and now 9.3. When it hangs, I cannot get to it to get a log file, as it is headless. I guess I could pull out a monitor and keyboard for the next time it happens.
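Since the box is headless and has to be hard powered off, one way to keep some evidence is to append periodic snapshots to a file and force each write to disk. The following is only a sketch under assumptions: it presumes a Python 3 interpreter is available wherever it runs, the names `append_snapshot` and `watch` are made up for illustration, and the log path is something you choose yourself. If the log lives on storage that is itself hung, the write will block too.

```python
import os
import subprocess
import time
from datetime import datetime


def append_snapshot(log_path, command=("ps", "-aux"), timeout_seconds=30):
    """Run one command and append its timestamped output to log_path."""
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout_seconds
        )
        body = result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        # The command itself got stuck; record that instead of its output.
        body = "command timed out\n"
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a") as log:
        log.write("=== {} {} ===\n".format(stamp, " ".join(command)))
        log.write(body)
        # Flush and fsync so the snapshot survives a hard power-off.
        log.flush()
        os.fsync(log.fileno())


def watch(log_path, interval_seconds=60):
    """Take a snapshot every interval_seconds until interrupted."""
    while True:
        append_snapshot(log_path)
        time.sleep(interval_seconds)
```

Calling `watch()` with a path of your choosing in a separate session before starting a rip would leave the last few process listings on disk for inspection after the reboot.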


It seems to happen when I have an SSH session open to the jail and also have a file browser open to the same directory in the pool (the Music directory), shared over NFS to my Linux machine. The NAS IP address is assigned by my DHCP server as x.198. The jails are 201, 202 and 203, and there are no conflicting IPs according to the router.

EDIT: It just did it again. I tried to run a command in the jail (sudo -u nobody abcde) from my home directory in the jail and got a permission error, as nobody is not allowed there. I cd'd to the correct directory, ran the command again, and then it hung. The GUI will not load any jail info, but I already had a reboot button in the left pane, so I tried to reboot from the GUI. It got stuck and will not actually reboot.

Keith
 

Attachments

  • dmesg.txt
    11.6 KB · Views: 284

dlavigne

Guest
Can you add more RAM to the system? The minimum requirement for 9.3 is now 8GB.
 

dlavigne

Guest
Yup. Let us know whether or not the additional RAM resolves it.
 

keithg

Explorer
Joined
May 15, 2013
Messages
92
I ordered another 8 GB of RAM, but it is not here yet. In the meantime I was still ripping CDs and think I see what is triggering it. (Maybe RAM will fix it, but it seems unlikely to me. I am no expert.)

If I am in the mounted ZFS pool directory (I do not remember what it is called in the GUI, but it is the pool directory shared with the jail, set up from the GUI) as myuser:myuser and try to copy or move a file from that mounted directory (owned by nobody:nobody) back to my user's home directory (/home/myuser) in the jail, the process hangs. Something has changed (maybe with the last 2 updates), as I can now SSH into the jail even though the GUI shows 'loading' and never completes. When SSH'd into the jail, I can su to root. From there, I can see hung processes which cannot be killed with killall or kill -9, and I can move the file (as root). It seems to be a permissions issue. At this point, I can no longer SSH into the NAS but can still get into the jail (?).

My Jail user is a user of the NAS
In the jail, my jail user is in the nobody group

# ps -aux
USER      PID %CPU %MEM   VSZ  RSS TT STAT STARTED      TIME COMMAND
kgrider 77601 99.2  0.1 43412 7528 1- R+J  7:18PM  879:00.38 mc
root     4089  0.0  0.0 12052 1840 ?? IsJ  Fri08PM   0:00.00 dhclient: epair0b [priv] (dhclient)
_dhcp    4128  0.0  0.0 12052 1896 ?? IsJ  Fri08PM   0:00.00 dhclient: epair0b (dhclient)
root     4370  0.0  0.0 12052 1808 ?? SsJ  Fri08PM   0:00.78 /usr/sbin/syslogd -s
root     4445  0.0  0.1 46744 4868 ?? IsJ  Fri08PM   0:00.01 /usr/sbin/sshd
root     4449  0.0  0.0 14128 1824 ?? IsJ  Fri08PM   0:00.25 /usr/sbin/cron -s
root    25416  0.0  0.1 67884 5628 ?? IsJ  9:30AM    0:00.02 sshd: kgrider [priv] (sshd)
kgrider 25422  0.0  0.1 67884 5608 ?? SJ   9:30AM    0:00.01 sshd: kgrider@pts/4 (sshd)
kgrider 25170  0.0  0.0  3784 1456 0- D+J  9:26AM    0:00.00 mv eject /home/kgrider/
kgrider 25423  0.0  0.1 17452 3404 4  IsJ  9:30AM    0:00.01 -bash (bash)
root    26089  0.0  0.0 41168 2272 4  IJ   9:37AM    0:00.01 su
root    26090  0.0  0.1 17532 4080 4  RJ   9:37AM    0:00.02 _su (csh)
root    28126  0.0  0.0 14188 2040 4  R+J  9:58AM    0:00.00 ps -aux
kgrider 77602  0.0  0.1 17452 3400 2  Is+J 7:18PM    0:00.01 bash -rcfile .bashrc
root    78594  0.0  0.1 17532 3840 3- IJ   7:20PM    0:00.04 _su (csh)
root    79337  0.0  0.0  3784 1460 3- D+J  7:25PM    0:00.00 mv eject /home/kgrider/

The 'mc' and 'mv' entries are the two commands which are hung and cannot be killed. Both were attempts to move a file from the ZFS pool directory to my jail home directory. When I ran those commands the terminal froze, so I killed the terminal window and logged back in from a fresh one. There doesn't appear to be anything in dmesg about it.
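For anyone reading a listing like this: in FreeBSD's ps, a STAT value beginning with D marks a process in uninterruptible (usually disk) wait, which is consistent with the 'mv' processes ignoring kill -9, while 'mc' shows as runnable (R) at 99.2% CPU. A small sketch of how such entries could be picked out of saved ps output is below. The function names and the 90% CPU threshold are assumptions for illustration, and the sample rows are copied from the listing above.

```python
# A few rows copied from the ps -aux listing in this post.
PS_OUTPUT = """\
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
kgrider 77601 99.2 0.1 43412 7528 1- R+J 7:18PM 879:00.38 mc
root 4445 0.0 0.1 46744 4868 ?? IsJ Fri08PM 0:00.01 /usr/sbin/sshd
kgrider 25170 0.0 0.0 3784 1456 0- D+J 9:26AM 0:00.00 mv eject /home/kgrider/
root 79337 0.0 0.0 3784 1460 3- D+J 7:25PM 0:00.00 mv eject /home/kgrider/
"""


def parse_ps(text):
    """Turn whitespace-separated ps output into a list of dicts keyed by header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split only as many times as there are columns, so that
        # COMMAND keeps its embedded spaces.
        fields = line.split(None, len(header) - 1)
        if len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def find_stuck(rows, cpu_threshold=90.0):
    """Return (pid, reason, command) for processes that look stuck.

    cpu_threshold is an arbitrary cut-off chosen for this sketch.
    """
    flagged = []
    for row in rows:
        if row["STAT"].startswith("D"):
            flagged.append((row["PID"], "uninterruptible wait", row["COMMAND"]))
        elif float(row["%CPU"]) >= cpu_threshold:
            flagged.append((row["PID"], "high CPU", row["COMMAND"]))
    return flagged


for pid, reason, command in find_stuck(parse_ps(PS_OUTPUT)):
    print(pid, reason, command)
```

This only classifies what ps already reports; it does not explain why the processes are blocked.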

As there is no apparent way to reboot the NAS from the GUI (no reboot buttons are visible), the only way it seems I can safely reboot it is to apply whatever FreeNAS updates are available and allow it to reboot.

(EDIT: It looked like it would reboot, but it failed to do so, reporting on the console that there were processes which needed to be killed, though there was no way for me to kill them. I have a monitor and keyboard on the NAS now, but I could not enter any commands. I had to hard power it off again. The memory will be here next week, and I'll try the same thing again and see if there is a difference in behavior.)

Keith
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
At the bottom of the left column in the Web GUI I have Reboot and Shutdown buttons. Do you not see those?
 

keithg

Explorer
Joined
May 15, 2013
Messages
92
Nope. All I have in the left is a white pane with nothing in it. This happens when it hangs. The web interface cannot render that pane.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Then if you can't SSH in and use the 'reboot' command, and you can't get to the console menu and choose option 13, I guess you're a bit stuck.
 